On 2015-07-14 20:20, Paolo Bonzini wrote:
>
>
> On 14/07/2015 19:09, Aurelien Jarno wrote:
> > On 2015-07-14 17:38, Leon Alrae wrote:
> >> There seems to be an issue when trying to keep a pointer in bottom 32-bits
> >> of a 64-bit floating point register. Load and store instructions accessing
> >> this address for some reason use the whole 64-bit content of floating point
> >> register rather than truncated 32-bit value. The following load uses
> >> incorrect address which leads to a crash if upper 32 bits of $f0 isn't 0:
> >>
> >> 0x00400c60: mfc1 t8,$f0
> >> 0x00400c64: lw t9,0(t8)
> >>
> >> It can be reproduced with the following linux userland program when running
> >> on a MIPS32 with CP0.Status.FR=1 (by default mips32r5-generic and
> >> mips32r6-generic CPUs have this bit set in linux-user).
> >>
> >> int main(int argc, char *argv[])
> >> {
> >>     int tmp = 0x11111111;
> >>     /* Set f0 */
> >>     __asm__ ("mtc1 %0, $f0\n"
> >>              "mthc1 %1, $f0\n"
> >>              : : "r" (&tmp), "r" (tmp));
> >>     /* At this point $f0: w:76fff040 d:1111111176fff040 */
> >>     __asm__ ("mfc1 $t8, $f0\n"
> >>              "lw $t9, 0($t8)\n"); /* <--- crash! */
> >>     return 0;
> >> }
> >>
> >> Running above program in normal (non-singlestep mode) leads to:
> >>
> >> Program received signal SIGSEGV, Segmentation fault.
> >> 0x00005555559f6f37 in static_code_gen_buffer ()
> >> (gdb) x/i 0x00005555559f6f37
> >> => 0x5555559f6f37 <static_code_gen_buffer+78359>: mov %gs:0x0(%rbp),%ebp
> >> (gdb) info registers rbp
> >> rbp 0x1111111176fff040 0x1111111176fff040
> >>
> >> The program runs fine in singlestep mode, or with disabled TCG
> >> optimizations. Also, I'm not able to reproduce it in system emulation.
> >
> > I am able to reproduce the problem, but for me disabling the
> > optimizations doesn't help. That said the problem is just another issue
> > with the "let's assume the target supports move between 32 and 64 bit
> > registers". At some point we should add a paragraph to tcg/README, to
> > define how handle 32 vs 64 bit registers and what the TCG targets should
> > expect. We had to add special code to handle that for sparc
> > (trunc_shr_i32 instruction), but also code to the optimizer to remember
> > about "garbage" high bits. I am not sure someone has a global view about
> > how all this code interacts.
>
> I certainly don't have a global view, so much that I didn't think at
> all of the optimizer... Instead, it looks to me like a bug in the
> register allocator. In particular this code in tcg_reg_alloc_mov:

That's exactly my point when I said that I am not sure anyone has a
global view. I think the fact that we don't check the type when
simplifying moves in the register allocator is intentional, the same way
we simply transform the trunc op into a mov op (except on sparc). This is
done because the truncation is not needed on, for example, x86 and most
other architectures, given that 32-bit instructions do not care about the
high part of the registers.

Basically, the size-changing ops are trunc_i64_i32, ext_i32_i64 and
extu_i32_i64. We can be conservative and implement all of them as real
instructions in all TCG backends. In that case the mov op never has to
deal with registers of different sizes (just like we enforce at the TCG
frontend level), and neither the register allocator nor the optimizer has
to deal with this.

However, that is suboptimal on some architectures, which is why on x86 we
decided to just replace trunc_i64_i32 with a move. But if we do this
simplification, it should be done everywhere (in that case, including in
the qemu_ld op). And DOCUMENTED somewhere, given that different choices
can be made for different backends.

As for the optimizer, its goal is to predict the value of the registers
by constant folding. It should be seen as another CPU, with its own
rules. For example, TCG internally stores 32-bit constants sign-extended.
The optimizer should follow the same convention.

-- 
Aurelien Jarno                          GPG: 4096R/1DDD8C9B
aurel...@aurel32.net                 http://www.aurel32.net