Re: [Qemu-devel] Holding the BQL for emulate_ppc_hypercall

Alex Bennée Tue, 25 Oct 2016 01:38:32 -0700

David Gibson <da...@gibson.dropbear.id.au> writes:

> On Mon, Oct 24, 2016 at 03:44:01PM +0100, Alex Bennée wrote:
>>
>> Alex Bennée <alex.ben...@linaro.org> writes:
>>
>> > Hi,
>> >
>> > In the MTTCG patch set one of the big patches is to remove the
>> > requirement to hold the BQL while running code:
>> >
>> >   tcg: drop global lock during TCG code execution
>> >
>> > And this broke the PPC code because emulate_ppc_hypercall can cause
>> > changes to the global state. This function just calls spapr_hypercall()
>> > and puts the results into the TCG register file. Normally
>> > spapr_hypercall() is called under the BQL in KVM as
>> > kvm_arch_handle_exit() does things with the BQL held.
>> >
>> > I blithely wrapped the called in a lock/unlock pair only to find the
>> > ppc64 check builds failed as the hypercall was made during the
>> > cc->do_interrupt() code which also holds the BQL.
>> >
>> > I'm a little confused by the nature of PPC hypercalls in TCG? Are they
>> > not all detectable at code generation time? What is the case that causes
>> > an exception to occur rather than the helper function doing the
>> > hypercall?
>> >
>> > I guess it comes down to can I avoid doing:
>> >
>> >   /* If we come via cc->do_interrupt BQL may already be held */
>> >   if (!qemu_mutex_iothread_locked()) {
>> >       g_mutex_lock_iothread();
>> >       env->gpr[3] = spapr_hypercall(cpu, env->gpr[3], &env->gpr[4]);
>> >       g_muetx_unlock_iothread();
>> >   } else {
>> >       env->gpr[3] = spapr_hypercall(cpu, env->gpr[3], &env->gpr[4]);
>> >   }
>>
>> Of course I mean:
>>
>>   /* If we come via cc->do_interrupt BQL may already be held */
>>   if (!qemu_mutex_iothread_locked()) {
>>       qemu_mutex_lock_iothread();
>>       env->gpr[3] = spapr_hypercall(cpu, env->gpr[3], &env->gpr[4]);
>>       qemu_mutex_unlock_iothread();
>>   } else {
>>       env->gpr[3] = spapr_hypercall(cpu, env->gpr[3], &env->gpr[4]);
>>   }
>>
>> > Any thoughts?
>
> So, I understand why the hypercall is being called from exception code
> and therefore with the BQL held.  On Power, the hypercall instruction
> is the same as the guest-level system call instruction, just with a
> flag bit set.  System calls are, of course, treated as exceptions,
> because they change the CPU's privilege mode.  Likewise if we were
> implementing a full host system (like the upcoming 'powernv' machine
> type) we'd need to treat hypercalls as exceptions for the same reason.
>
> We could detect hypercalls at translation time, but at present we
> don't: we go into the exception path, then detect that it's a "level
> 1" (i.e. hypervisor) sc instruction and branch off to the hypercall
> emulation code if that's been set up.  It just seemed the simplet
> approach at the time.
>
> What I *don't* understand is how the hypercall code is ever being
> invoked *without* the BQL.  I grepped through and the only entry paths
> I can see are the one in the exception handling and KVM.


Ahh I think I have figured out what happened.

In the pre-MTTCG world TCG code always held the BQL. After Jan's patch
the BQL was held for interrupt processing but not exception handling.
I've since fixed this. I think I must have tried this hack before fixing
the root cause because I had another assertion failure when trying to
ensure all IRQ processing has the BQL held (as the target specific code
can do all sorts of things on an IRQ/Exception).

> Could you try to get a backtrace from the case where we're entering
> the hypercall without the BQL?

I'll remove the hack, add an assert and see if it comes up again in
testing.

--
Alex Bennée

Re: [Qemu-devel] Holding the BQL for emulate_ppc_hypercall

Reply via email to