On 11/06/19 10:38, Philippe Mathieu-Daudé wrote:
> Cc'ing Paolo & Richard.
> 
> On 6/10/19 4:27 AM, Colin Xu wrote:
>> cc more.
>>
>> On 2019-06-10 10:19, Colin Xu wrote:
>>> QEMU tracks whether a vcpu is halted using CPUState::halted. E.g.,
>>> after initialization or reset, halted is 0 for the BSP (vcpu 0)
>>> and 1 for the APs (vcpu 1, 2, ...). A halted vcpu should not be
>>> handed to the hypervisor to run (e.g. hax_vcpu_run()).
>>>
>>> Under HAXM, Android Emulator sometimes boots into a "vcpu shutdown
>>> request" error while executing in SeaBIOS, with the HAXM driver
>>> logging a guest triple fault in vcpu 1, 2, ... at RIP 0x3. That is
>>> ultimately because the HAX accelerator asks HAXM to run those APs
>>> when they are still in the halted state.
>>>
>>> Normally, the vcpu thread for an AP will start by looping in
>>> qemu_wait_io_event(), until the BSP kicks it via a pair of IPIs
>>> (INIT followed by SIPI). But because the HAX accelerator does not
>>> honor cpu->halted, it allows the AP vcpu thread to proceed to
>>> hax_vcpu_run() as soon as it receives any kick, even if the kick
>>> does not come from the BSP. It turns out that emulator has a
>>> worker thread which periodically kicks every vcpu thread (possibly
>>> to collect CPU usage data), and if one of these kicks comes before
>>> those by the BSP, the AP will start execution from the wrong RIP,
>>> resulting in the aforementioned SMP boot failure.
>>>
>>> The solution is inspired by the KVM accelerator (credit to
>>> Chuanxiao Dong <chuanxiao.d...@intel.com> for the pointer):
>>>
>>> 1. Get rid of questionable logic that unconditionally resets
>>>     cpu->halted before hax_vcpu_run(). Instead, only reset it at the
>>>     right moments (there are only a few "unhalt" events).
>>> 2. Add a check for cpu->halted before hax_vcpu_run().
>>>
>>> Note that although the non-Unrestricted Guest (!ug_platform) code
>>> path also forcibly resets cpu->halted, it is left untouched,
>>> because only the UG code path supports SMP guests.
>>>
>>> The patch is first merged to android emulator with Change-Id:
>>> I9c5752cc737fd305d7eace1768ea12a07309d716
>>>
>>> Cc: Yu Ning <yu.n...@intel.com>
>>> Cc: Chuanxiao Dong <chuanxiao.d...@intel.com>
>>> Signed-off-by: Colin Xu <colin...@intel.com>
>>> ---
>>>   cpus.c                |  1 -
>>>   target/i386/hax-all.c | 36 ++++++++++++++++++++++++++++++++++--
>>>   2 files changed, 34 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/cpus.c b/cpus.c
>>> index ffc57119ca5e..c1a56cd9ab01 100644
>>> --- a/cpus.c
>>> +++ b/cpus.c
>>> @@ -1591,7 +1591,6 @@ static void *qemu_hax_cpu_thread_fn(void *arg)
>>>         cpu->thread_id = qemu_get_thread_id();
>>>       cpu->created = true;
>>> -    cpu->halted = 0;
>>>       current_cpu = cpu;
>>>         hax_init_vcpu(cpu);
>>> diff --git a/target/i386/hax-all.c b/target/i386/hax-all.c
>>> index 44b89c1d74ae..58a27b475ec8 100644
>>> --- a/target/i386/hax-all.c
>>> +++ b/target/i386/hax-all.c
>>> @@ -471,13 +471,35 @@ static int hax_vcpu_hax_exec(CPUArchState *env)
>>>           return 0;
>>>       }
>>>   -    cpu->halted = 0;
>>> -
>>>       if (cpu->interrupt_request & CPU_INTERRUPT_POLL) {
>>>           cpu->interrupt_request &= ~CPU_INTERRUPT_POLL;
>>>           apic_poll_irq(x86_cpu->apic_state);
>>>       }
>>>   +    /* After a vcpu is halted (either because it is an AP and has
>>> just been
>>> +     * reset, or because it has executed the HLT instruction), it
>>> will not be
>>> +     * run (hax_vcpu_run()) until it is unhalted. The next few if
>>> blocks check
>>> +     * for events that may change the halted state of this vcpu:
>>> +     *  a) Maskable interrupt, when RFLAGS.IF is 1;
>>> +     *     Note: env->eflags may not reflect the current RFLAGS
>>> state, because
>>> +     *           it is not updated after each hax_vcpu_run(). We
>>> cannot afford
>>> +     *           to fail to recognize any
>>> unhalt-by-maskable-interrupt event
>>> +     *           (in which case the vcpu will halt forever), and yet
>>> we cannot
>>> +     *           afford the overhead of hax_vcpu_sync_state(). The
>>> current
>>> +     *           solution is to err on the side of caution and have
>>> the HLT
>>> +     *           handler (see case HAX_EXIT_HLT below)
>>> unconditionally set the
>>> +     *           IF_MASK bit in env->eflags, which, in effect,
>>> disables the
>>> +     *           RFLAGS.IF check.
>>> +     *  b) NMI;
>>> +     *  c) INIT signal;
>>> +     *  d) SIPI signal.
>>> +     */
>>> +    if (((cpu->interrupt_request & CPU_INTERRUPT_HARD) &&
>>> +         (env->eflags & IF_MASK)) ||
>>> +        (cpu->interrupt_request & CPU_INTERRUPT_NMI)) {
>>> +        cpu->halted = 0;
>>> +    }
>>> +
>>>       if (cpu->interrupt_request & CPU_INTERRUPT_INIT) {
>>>           DPRINTF("\nhax_vcpu_hax_exec: handling INIT for %d\n",
>>>                   cpu->cpu_index);
>>> @@ -493,6 +515,16 @@ static int hax_vcpu_hax_exec(CPUArchState *env)
>>>           hax_vcpu_sync_state(env, 1);
>>>       }
>>>   +    if (cpu->halted) {
>>> +        /* If this vcpu is halted, we must not ask HAXM to run it.
>>> Instead, we
>>> +         * break out of hax_smp_cpu_exec() as if this vcpu had
>>> executed HLT.
>>> +         * That way, this vcpu thread will be trapped in
>>> qemu_wait_io_event(),
>>> +         * until the vcpu is unhalted.
>>> +         */
>>> +        cpu->exception_index = EXCP_HLT;
>>> +        return 0;
>>> +    }
>>> +
>>>       do {
>>>           int hax_ret;
>>>   
>>

Queued, thanks.

Paolo

Reply via email to