[Rehashing a relatively old thread. It is available, in particular, at http://thread.gmane.org/gmane.comp.emulators.qemu/196629]
22.02.2013 22:09, Peter Maydell wrote: > This patch series gets rid of cpu_unlink_tb(), which is irredeemably > racy, since it modifies the TB graph with no locking from other > threads, signal handlers, etc etc. (The signal handler case is > why you can't just fix this with more locks.) Instead we take the > much simpler approach of setting a flag for the CPU when we want > it to stop executing TBs, and generate code to check the flag at > the start of every TB. The raciness is easiest to provoke with > multithreaded linux-user guests but it is I think also a risk > in system emulation mode. > > This fixes the crashes seen in LP:668799; however there are another > class of crashes described in LP:1098729 which stem from the fact > that in linux-user with a multithreaded guest all threads will > use and modify the same global TCG date structures (including the > generated code buffer) without any kind of locking. This means that > multithreaded guest binaries are still in the "unsupported" category. Now when Debian Wheezy has been released with qemu 1.2, users started to file bugreports about multithreaded apps crashing. So, while I understand these are "unsupported" as per above, I think it is better to fix as much as we can, and allow some more usage cases, than to describe that it "does not work". (And yes, I also understand it's better to fix that before a release, but oh well). So I tried to backport the series to 1.2 branch, to see how it goes. But there were many changes since 1.2, so it ended up in quite some changes, -- not exactly huge and complicated, but I need some help or additional pair (or two) of eyes to ensure the sanity of the resulting code. The backported patchset is smaller than the original. > Andreas Färber (1): > cpu: Introduce ENV_OFFSET macros This change isn't needed in 1.2, because all CPU state is in single place there, it hasn't been split between env and cpu states there. > Peter Maydell (5): > tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses Just a minor context difference in the header, no questions. > cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC This needed a "back merge" of env+cpu states back to cpu. Maybe we should somehow revisit the whole concept of the two, because it's sorta fun: at some point all functions has been converted to accept `cpu' instead of `env', but in many places the first thing a function does is to get `env' pointer out of `cpu'. The backport of this change reverts this piggy-backing and uses `env' everywhere consistently again. > Handle CPU interrupts by inline checking of a flag The main patch in the series. The same simplification from `cpu' back to `env' as above. I had to add `tcg_exit_req' field into CPU_COMMON macro in cpu-defs.h, instead of adding it to CPUState struct in include/qom/cpu.h. Plus the corresponding changes in gen-icount.h -- that's where ENV_OFFSET was used, but due to the same env to cpu split, not needed anymore. My main question is actually about this place, I don't exactly understand how the code generation works, so am not sure if I backported it correctly. Plus minor code shuffling - for one, bits in translate-all.c was in exec.c before, so I had to modify it there. > translate-all.c: Remove cpu_unlink_tb() That's easy, but again the bits being removed are in exec.c in 1.2, not in translate-all.c (so the resulting backported patch is now misnamed). Technically this patch isn't needed for a backport, since the function(s) are really unused, but gcc generates a warning about unused static function and the compilation fails due to -Werror. > gen-icount.h: Rename gen_icount_start/end to gen_tb_start/end And I didn't try to backport this one, since this is just mechanical s/icount/tb/g without any code changes. Now, the resulting thing compiles (ha!), but I'm not really sure how to test it. I ran a few random apps using qemu-i386 and qemu-arm, it appears to work. If all goes well, I think this series needs to be included in whatever -stable qemu series are in use. I'll post the 4 resulting patches in reply to this message. Thank you! /mjt