The current implementation for getting/setting SIMD and SVE registers via remote GDB raises a concern about mixed-endian support.
Consider the following snippet from a GDB session in which a SIMD register's value is set via remote GDB, where the QEMU host is little endian and the target is big endian:

    (gdb) p/x $v0
    $1 = {d = {f = {0x0, 0x0}, u = {0x0, 0x0}, s = {0x0, 0x0}},
      s = {f = {0x0, 0x0, 0x0, 0x0}, u = {0x0, 0x0, 0x0, 0x0}, s = {0x0, 0x0, 0x0, 0x0}},
      h = {bf = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
        f = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
        u = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
        s = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}},
      b = {u = {0x0 <repeats 16 times>}, s = {0x0 <repeats 16 times>}},
      q = {u = {0x0}, s = {0x0}}}
    (gdb) set $v0.d.u[0] = 0x010203
    (gdb) p/x $v0
    $2 = {d = {f = {0x302010000000000, 0x0}, u = {0x302010000000000, 0x0}, s = {0x302010000000000, 0x0}},
      s = {f = {0x3020100, 0x0, 0x0, 0x0}, u = {0x3020100, 0x0, 0x0, 0x0}, s = {0x3020100, 0x0, 0x0, 0x0}},
      h = {bf = {0x302, 0x100, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
        f = {0x302, 0x100, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
        u = {0x302, 0x100, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},
        s = {0x302, 0x100, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}},
      b = {u = {0x3, 0x2, 0x1, 0x0 <repeats 13 times>}, s = {0x3, 0x2, 0x1, 0x0 <repeats 13 times>}},
      q = {u = {0x3020100000000000000000000000000}, s = {0x3020100000000000000000000000000}}}

The above snippet exemplifies an issue with how the SIMD register value is set when the target endianness differs from the host endianness.
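One way to see where a value like 0x302010000000000 can come from is to take the eight bytes of 0x010203 in big-endian order and load them as little endian. The sketch below is standalone C, not QEMU code: load_u64_le(), load_u64_be() and store_u64_be() are hypothetical stand-ins for the ldq_le_p()/ldq_be_p() style helpers, and it assumes GDB sends the 64-bit lane in the target's (big-endian) byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Requires C11: documents the lane size these helpers assume. */
static_assert(sizeof(uint64_t) == 8, "64-bit lanes expected");

/* Hypothetical stand-in for a little-endian 64-bit load (byte 0 is least significant). */
uint64_t load_u64_le(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Hypothetical stand-in for a big-endian 64-bit load (byte 0 is most significant). */
uint64_t load_u64_be(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Lay out v in big-endian byte order, as assumed for a big-endian target. */
void store_u64_be(uint8_t *p, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}
```

Storing 0x010203 with store_u64_be() and reading it back with load_u64_le() gives 0x0302010000000000, the byte-swapped pattern seen in the session, whereas load_u64_be() returns 0x010203.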
A similar issue is evident when setting SVE registers, as shown by the snippet below, where the QEMU host is little endian and the target is big endian:

    (gdb) p/x $z0
    $1 = {q = {u = {0x0 <repeats 16 times>}, s = {0x0 <repeats 16 times>}},
      d = {f = {0x0 <repeats 32 times>}, u = {0x0 <repeats 32 times>}, s = {0x0 <repeats 32 times>}},
      s = {f = {0x0 <repeats 64 times>}, u = {0x0 <repeats 64 times>}, s = {0x0 <repeats 64 times>}},
      h = {f = {0x0 <repeats 128 times>}, u = {0x0 <repeats 128 times>}, s = {0x0 <repeats 128 times>}},
      b = {u = {0x0 <repeats 256 times>}, s = {0x0 <repeats 256 times>}}}
    (gdb) set $z0.q.u[0] = 0x010203
    (gdb) p/x $z0
    $2 = {q = {u = {0x302010000000000, 0x0 <repeats 15 times>}, s = {0x302010000000000, 0x0 <repeats 15 times>}},
      d = {f = {0x0, 0x302010000000000, 0x0 <repeats 30 times>},
        u = {0x0, 0x302010000000000, 0x0 <repeats 30 times>},
        s = {0x0, 0x302010000000000, 0x0 <repeats 30 times>}},
      s = {f = {0x0, 0x0, 0x3020100, 0x0 <repeats 61 times>},
        u = {0x0, 0x0, 0x3020100, 0x0 <repeats 61 times>},
        s = {0x0, 0x0, 0x3020100, 0x0 <repeats 61 times>}},
      h = {f = {0x0, 0x0, 0x0, 0x0, 0x302, 0x100, 0x0 <repeats 122 times>},
        u = {0x0, 0x0, 0x0, 0x0, 0x302, 0x100, 0x0 <repeats 122 times>},
        s = {0x0, 0x0, 0x0, 0x0, 0x302, 0x100, 0x0 <repeats 122 times>}},
      b = {u = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x2, 0x1, 0x0 <repeats 245 times>},
        s = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x2, 0x1, 0x0 <repeats 245 times>}}}

Note that in the case of SVE, this issue is also present when the host and target are both little endian.
Consider the GDB remote session snippet below showcasing this:

    (gdb) p/x $z0
    $6 = {q = {u = {0x0 <repeats 16 times>}, s = {0x0 <repeats 16 times>}},
      d = {f = {0x0 <repeats 32 times>}, u = {0x0 <repeats 32 times>}, s = {0x0 <repeats 32 times>}},
      s = {f = {0x0 <repeats 64 times>}, u = {0x0 <repeats 64 times>}, s = {0x0 <repeats 64 times>}},
      h = {f = {0x0 <repeats 128 times>}, u = {0x0 <repeats 128 times>}, s = {0x0 <repeats 128 times>}},
      b = {u = {0x0 <repeats 256 times>}, s = {0x0 <repeats 256 times>}}}
    (gdb) set $z0.q.u[0] = 0x010203
    (gdb) p/x $z0
    $7 = {q = {u = {0x102030000000000000000, 0x0 <repeats 15 times>}, s = {0x102030000000000000000, 0x0 <repeats 15 times>}},
      d = {f = {0x0, 0x10203, 0x0 <repeats 30 times>},
        u = {0x0, 0x10203, 0x0 <repeats 30 times>},
        s = {0x0, 0x10203, 0x0 <repeats 30 times>}},
      s = {f = {0x0, 0x0, 0x10203, 0x0 <repeats 61 times>},
        u = {0x0, 0x0, 0x10203, 0x0 <repeats 61 times>},
        s = {0x0, 0x0, 0x10203, 0x0 <repeats 61 times>}},
      h = {f = {0x0, 0x0, 0x0, 0x0, 0x203, 0x1, 0x0 <repeats 122 times>},
        u = {0x0, 0x0, 0x0, 0x0, 0x203, 0x1, 0x0 <repeats 122 times>},
        s = {0x0, 0x0, 0x0, 0x0, 0x203, 0x1, 0x0 <repeats 122 times>}},
      b = {u = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x2, 0x1, 0x0 <repeats 245 times>},
        s = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3, 0x2, 0x1, 0x0 <repeats 245 times>}}}

In all scenarios, the value read back after setting the register to 0x010203 is not in the appropriate byte order, so it does not print 0x010203 as expected.

The current implementation for setting SIMD registers via the gdbstub is as follows:

    aarch64_gdb_set_fpu_reg:
    <omitted code>
        uint64_t *q = aa64_vfp_qreg(env, reg);
        q[0] = ldq_le_p(buf);
        q[1] = ldq_le_p(buf + 8);
        return 16;
    <omitted code>

The following code is a suggested fix for the current implementation that should allow for mixed-endian support when getting/setting SIMD registers via the remote GDB protocol.

    aarch64_gdb_set_fpu_reg:
    <omitted code>
    // case 0...31
        uint64_t *q = aa64_vfp_qreg(env, reg);
        if (target_big_endian()) {
            q[1] = ldq_p(buf);
            q[0] = ldq_p(buf + 8);
        } else {
            q[0] = ldq_p(buf);
            q[1] = ldq_p(buf + 8);
        }
        return 16;
    <omitted code>

The use of ldq_p rather than ldq_le_p (which the current implementation uses) to load bytes into the host-endian struct is inspired by the current implementation for getting/setting general purpose registers via remote GDB (which works appropriately regardless of target endianness), as well as the current implementation for getting/setting GPRs via GDB with ppc as a target (refer to ppc_cpu_gdb_write_register() for an example). Note that the order of setting q[0] and q[1] is swapped for big-endian targets to ensure that q[1] always holds the most significant half and q[0] always holds the least significant half (refer to the comment in target/arm/cpu.h, line 155).

For SVE, the current implementation is as follows for the zregs:

    aarch64_gdb_set_sve_reg:
    <omitted code>
    // case 0...31
        int vq, len = 0;
        uint64_t *p = (uint64_t *) buf;
        for (vq = 0; vq < cpu->sve_max_vq; vq++) {
            env->vfp.zregs[reg].d[vq * 2 + 1] = *p++;
            env->vfp.zregs[reg].d[vq * 2] = *p++;
            len += 16;
        }
        return len;

The suggestion here is similar to the one above for SIMD: ldq_p should be used rather than simple pointer dereferencing. This suggestion aims to allow the QEMU gdbstub to get/set register values correctly regardless of the target endianness.
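As a rough illustration of how the SIMD suggestion might carry over to the zregs, the standalone sketch below models the per-quadword loop outside of QEMU. It is an assumption-laden model, not a patch: set_zreg() and load_u64() are hypothetical names, load_u64() stands in for a target-endian load such as ldq_p, the big_endian flag stands in for the target-endianness check, and the destination array models zregs[reg].d with d[2 * vq] as the least significant half of quadword vq.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VQ 16 /* 16 quadwords, i.e. a 2048-bit vector */

/* Hypothetical target-endian 64-bit load (stand-in for ldq_p). */
uint64_t load_u64(const uint8_t *p, bool big_endian)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        /* Consume bytes from most significant to least significant. */
        v = (v << 8) | p[big_endian ? i : 7 - i];
    }
    return v;
}

/*
 * Fill d[] from a GDB-style register buffer, one 128-bit quadword at a time.
 * d[2 * vq] receives the low half and d[2 * vq + 1] the high half.
 * Returns the number of bytes consumed.
 */
int set_zreg(uint64_t *d, const uint8_t *buf, int max_vq, bool big_endian)
{
    int len = 0;

    assert(max_vq > 0 && max_vq <= MAX_VQ);
    for (int vq = 0; vq < max_vq; vq++) {
        uint64_t first = load_u64(buf + len, big_endian);
        uint64_t second = load_u64(buf + len + 8, big_endian);

        if (big_endian) {
            /* Big-endian buffer: most significant half comes first. */
            d[vq * 2 + 1] = first;
            d[vq * 2] = second;
        } else {
            /* Little-endian buffer: least significant half comes first. */
            d[vq * 2] = first;
            d[vq * 2 + 1] = second;
        }
        len += 16;
    }
    return len;
}
```

Under these assumptions, a buffer encoding 0x010203 in quadword 0 ends up with d[0] == 0x10203 and d[1] == 0 for either endianness; the register read path would need to mirror the same ordering for the round trip to hold.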
The expected result is output such as the following from a remote GDB session, regardless of target endianness:

    (gdb) p/x $z0
    $1 = {q = {u = {0x0 <repeats 16 times>}, s = {0x0 <repeats 16 times>}},
      d = {f = {0x0 <repeats 32 times>}, u = {0x0 <repeats 32 times>}, s = {0x0 <repeats 32 times>}},
      s = {f = {0x0 <repeats 64 times>}, u = {0x0 <repeats 64 times>}, s = {0x0 <repeats 64 times>}},
      h = {f = {0x0 <repeats 128 times>}, u = {0x0 <repeats 128 times>}, s = {0x0 <repeats 128 times>}},
      b = {u = {0x0 <repeats 256 times>}, s = {0x0 <repeats 256 times>}}}
    (gdb) set $z0.q.u[0] = 0x010203
    (gdb) p/x $z0
    $2 = {q = {u = {0x10203, 0x0 <repeats 15 times>}, s = {0x10203, 0x0 <repeats 15 times>}},
      d = {f = {0x10203, 0x0 <repeats 31 times>},
        u = {0x10203, 0x0 <repeats 31 times>},
        s = {0x10203, 0x0 <repeats 31 times>}},
      s = {f = {0x10203, 0x0 <repeats 63 times>},
        u = {0x10203, 0x0 <repeats 63 times>},
        s = {0x10203, 0x0 <repeats 63 times>}},
      h = {f = {0x203, 0x1, 0x0 <repeats 126 times>},
        u = {0x203, 0x1, 0x0 <repeats 126 times>},
        s = {0x203, 0x1, 0x0 <repeats 126 times>}},
      b = {u = {0x3, 0x2, 0x1, 0x0 <repeats 253 times>},
        s = {0x3, 0x2, 0x1, 0x0 <repeats 253 times>}}}