On 11/06/19 10:38, Philippe Mathieu-Daudé wrote: > Cc'ing Paolo & Richard. > > On 6/10/19 4:27 AM, Colin Xu wrote: >> cc more. >> >> On 2019-06-10 10:19, Colin Xu wrote: >>> QEMU tracks whether a vcpu is halted using CPUState::halted. E.g., >>> after initialization or reset, halted is 0 for the BSP (vcpu 0) >>> and 1 for the APs (vcpu 1, 2, ...). A halted vcpu should not be >>> handed to the hypervisor to run (e.g. hax_vcpu_run()). >>> >>> Under HAXM, Android Emulator sometimes boots into a "vcpu shutdown >>> request" error while executing in SeaBIOS, with the HAXM driver >>> logging a guest triple fault in vcpu 1, 2, ... at RIP 0x3. That is >>> ultimately because the HAX accelerator asks HAXM to run those APs >>> when they are still in the halted state. >>> >>> Normally, the vcpu thread for an AP will start by looping in >>> qemu_wait_io_event(), until the BSP kicks it via a pair of IPIs >>> (INIT followed by SIPI). But because the HAX accelerator does not >>> honor cpu->halted, it allows the AP vcpu thread to proceed to >>> hax_vcpu_run() as soon as it receives any kick, even if the kick >>> does not come from the BSP. It turns out that emulator has a >>> worker thread which periodically kicks every vcpu thread (possibly >>> to collect CPU usage data), and if one of these kicks comes before >>> those by the BSP, the AP will start execution from the wrong RIP, >>> resulting in the aforementioned SMP boot failure. >>> >>> The solution is inspired by the KVM accelerator (credit to >>> Chuanxiao Dong <chuanxiao.d...@intel.com> for the pointer): >>> >>> 1. Get rid of questionable logic that unconditionally resets >>> cpu->halted before hax_vcpu_run(). Instead, only reset it at the >>> right moments (there are only a few "unhalt" events). >>> 2. Add a check for cpu->halted before hax_vcpu_run(). >>> >>> Note that although the non-Unrestricted Guest (!ug_platform) code >>> path also forcibly resets cpu->halted, it is left untouched, >>> because only the UG code path supports SMP guests. >>> >>> The patch is first merged to android emulator with Change-Id: >>> I9c5752cc737fd305d7eace1768ea12a07309d716 >>> >>> Cc: Yu Ning <yu.n...@intel.com> >>> Cc: Chuanxiao Dong <chuanxiao.d...@intel.com> >>> Signed-off-by: Colin Xu <colin...@intel.com> >>> --- >>> cpus.c | 1 - >>> target/i386/hax-all.c | 36 ++++++++++++++++++++++++++++++++++-- >>> 2 files changed, 34 insertions(+), 3 deletions(-) >>> >>> diff --git a/cpus.c b/cpus.c >>> index ffc57119ca5e..c1a56cd9ab01 100644 >>> --- a/cpus.c >>> +++ b/cpus.c >>> @@ -1591,7 +1591,6 @@ static void *qemu_hax_cpu_thread_fn(void *arg) >>> cpu->thread_id = qemu_get_thread_id(); >>> cpu->created = true; >>> - cpu->halted = 0; >>> current_cpu = cpu; >>> hax_init_vcpu(cpu); >>> diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c >>> index 44b89c1d74ae..58a27b475ec8 100644 >>> --- a/target/i386/hax-all.c >>> +++ b/target/i386/hax-all.c >>> @@ -471,13 +471,35 @@ static int hax_vcpu_hax_exec(CPUArchState *env) >>> return 0; >>> } >>> - cpu->halted = 0; >>> - >>> if (cpu->interrupt_request & CPU_INTERRUPT_POLL) { >>> cpu->interrupt_request &= ~CPU_INTERRUPT_POLL; >>> apic_poll_irq(x86_cpu->apic_state); >>> } >>> + /* After a vcpu is halted (either because it is an AP and has >>> just been >>> + * reset, or because it has executed the HLT instruction), it >>> will not be >>> + * run (hax_vcpu_run()) until it is unhalted. The next few if >>> blocks check >>> + * for events that may change the halted state of this vcpu: >>> + * a) Maskable interrupt, when RFLAGS.IF is 1; >>> + * Note: env->eflags may not reflect the current RFLAGS >>> state, because >>> + * it is not updated after each hax_vcpu_run(). We >>> cannot afford >>> + * to fail to recognize any >>> unhalt-by-maskable-interrupt event >>> + * (in which case the vcpu will halt forever), and yet >>> we cannot >>> + * afford the overhead of hax_vcpu_sync_state(). The >>> current >>> + * solution is to err on the side of caution and have >>> the HLT >>> + * handler (see case HAX_EXIT_HLT below) >>> unconditionally set the >>> + * IF_MASK bit in env->eflags, which, in effect, >>> disables the >>> + * RFLAGS.IF check. >>> + * b) NMI; >>> + * c) INIT signal; >>> + * d) SIPI signal. >>> + */ >>> + if (((cpu->interrupt_request & CPU_INTERRUPT_HARD) && >>> + (env->eflags & IF_MASK)) || >>> + (cpu->interrupt_request & CPU_INTERRUPT_NMI)) { >>> + cpu->halted = 0; >>> + } >>> + >>> if (cpu->interrupt_request & CPU_INTERRUPT_INIT) { >>> DPRINTF("\nhax_vcpu_hax_exec: handling INIT for %d\n", >>> cpu->cpu_index); >>> @@ -493,6 +515,16 @@ static int hax_vcpu_hax_exec(CPUArchState *env) >>> hax_vcpu_sync_state(env, 1); >>> } >>> + if (cpu->halted) { >>> + /* If this vcpu is halted, we must not ask HAXM to run it. >>> Instead, we >>> + * break out of hax_smp_cpu_exec() as if this vcpu had >>> executed HLT. >>> + * That way, this vcpu thread will be trapped in >>> qemu_wait_io_event(), >>> + * until the vcpu is unhalted. >>> + */ >>> + cpu->exception_index = EXCP_HLT; >>> + return 0; >>> + } >>> + >>> do { >>> int hax_ret; >>> >>
Queued, thanks. Paolo