Dong, Eddie wrote:
> Avi Kivity wrote:
>   
>> Dong, Eddie wrote:
>>     
>>>> What about preemption:
>>>>
>>>> - vcpu executes lapic code in qemu process context
>>>>
>>>>         
>>> I don't understand. The LAPIC is in the kernel, how can qemu access it?
>>> If you mean qemu calling the APIC KVM syscall, then that path already
>>> disables preemption & takes kvm->lock.
>>>
>>>
>>>
>>>       
>> I meant qemu executes the KVM_VCPU_RUN ioctl.  kvm->lock does not
>> disable preemption (it is a mutex).
>>     
>
> Just noticed it has changed to a mutex, but it seems the same here :-)
> If the process is switched to another task, that's OK since the other task
> won't access the local APIC. The current VP's access to the APIC will take
> the mutex first (see below). Or are you talking about another corner case?
>
>   

apic access from process context is protected by kvm->lock, but apic
access from hrtimer is not.  Consider this scenario:

- guest accesses apic
- apic code starts modifying apic data
<preemption>
- timer fires
- apic_timer_fn() corrupts apic data

(I'm not even sure preemption is required here)

I think that in Xen this can't happen because it is not preemptible and
timers are processed when exiting back to the guest.
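Roughly something like this (just a sketch; the struct and field names are
made up, not the real kvm_lapic layout):

#include <linux/types.h>

struct lapic_sketch {
	u32 tmict;	/* initial count register */
	u32 tmcct;	/* current count register */
	int pending;	/* timer ticks not yet injected */
};

/* Process context: runs under kvm->lock, but that is a mutex, so it
 * neither disables preemption nor excludes the hrtimer callback. */
static void lapic_timer_write(struct lapic_sketch *apic, u32 val)
{
	apic->tmict = val;
	/* <-- hrtimer can fire right here (no preemption needed) */
	apic->tmcct = val;
	apic->pending = 0;
}

/* Hard-irq context: nothing serializes this against the writer above,
 * so it can see and overwrite half-updated state. */
static void lapic_timer_fires(struct lapic_sketch *apic)
{
	apic->pending++;
	apic->tmcct = apic->tmict;
}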



>> Do we really take kvm->lock for local accesses?  That's a significant
>> problem, much more than the timer.
>>     
>
> Today all APIC/IOAPIC accesses come from the shadow page fault path, which
> already takes kvm->lock. KVM_IRQ_LINE will take it too. (Just noticed the
> save/restore part missed this one; will add it later if we agree here.)
> PIC access comes from kernel_pio, which takes the mutex too.
>
> Another missing place is vmx_intr_assist, which needs to take the mutex
> too. Will add that later.
>
>   

The apic can be protected by vcpu->mutex; platform-wide things (pic,
ioapic) should be protected by kvm->lock.  This will work if we move all
apic processing to process context, as I proposed in a previous mail.
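Something along these lines (sketch only, simplified structures standing
in for kvm_vcpu/kvm):

#include <linux/mutex.h>
#include <linux/types.h>

struct vcpu_sketch {
	struct mutex mutex;		/* stands in for vcpu->mutex */
	u32 apic_regs[64];
};

struct kvm_sketch {
	struct mutex lock;		/* stands in for kvm->lock */
	u32 ioapic_redir[24];
};

static u32 apic_read(struct vcpu_sketch *vcpu, unsigned int reg)
{
	u32 val;

	mutex_lock(&vcpu->mutex);	/* local apic: per-vcpu lock only */
	val = vcpu->apic_regs[reg];
	mutex_unlock(&vcpu->mutex);
	return val;
}

static void ioapic_write(struct kvm_sketch *kvm, unsigned int pin, u32 val)
{
	mutex_lock(&kvm->lock);		/* shared chip: vm-wide lock */
	kvm->ioapic_redir[pin] = val;
	mutex_unlock(&kvm->lock);
}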


>> I meant in addition to timer migration (I really like the timer
>> migration part -- it's much more important than lock removal for
>> performance). kvm_vcpu_kick() is needed to wake up from halt, or if we
>> have races between the timer and task migration.
>>     
>
> :-)  Actually we solved this issue naturally in the previous patch and in
> this one.
> In the "adding back APIC timer IRQ" patch, we wake up the halted vCPU.
>
> In this patch, since the hrtimer always runs on the same pCPU as the guest
> VP (while the VP is active), each time the hrtimer fires (it comes from a
> hardware IRQ) the guest has already VM-exited to the kernel (the same
> effect as kvm_vcpu_kick, but with no need to call it explicitly), and then
> we do the IRQ injection at vmx_intr_assist time.
>   

Yes, the two solutions are very similar.  But I think mine protects
against a race:

- scheduler starts migrating vcpu from cpu 0 to cpu 1
- hrtimer fires on cpu 0, but apic_timer_fn not called yet
- vcpu on cpu 1 migrates the hrtimer
- vcpu enters guest mode on cpu 1
- cpu 0 calls apic_timer_fn

In this case, there will be no wakeup.  So I think you do need to call
kvm_vcpu_kick(), which will usually do nothing.
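Roughly like this (kvm_vcpu_kick() is the existing helper; the struct
layout is made up for the sketch): the callback can still run on the old
cpu after the vcpu has migrated and re-entered guest mode, so it has to
force the exit/wakeup explicitly.

#include <linux/hrtimer.h>
#include <linux/kernel.h>

struct kvm_vcpu;
void kvm_vcpu_kick(struct kvm_vcpu *vcpu);	/* existing KVM helper */

struct apic_sketch {
	struct hrtimer timer;
	atomic_t pending;		/* ticks waiting to be injected */
	struct kvm_vcpu *vcpu;
};

static enum hrtimer_restart apic_timer_fn(struct hrtimer *timer)
{
	struct apic_sketch *apic =
		container_of(timer, struct apic_sketch, timer);

	atomic_inc(&apic->pending);	/* atomic-only work in irq context */
	kvm_vcpu_kick(apic->vcpu);	/* usually a no-op, but closes the
					 * migration race described above */
	return HRTIMER_NORESTART;	/* periodic mode would rearm here */
}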

We also need to make sure all the non-atomic code in __apic_timer_fn()
is executed in process context (it can use the pending count to decide
how much to add).
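I.e. something like this for the process-context half, run under
vcpu->mutex on the way back into the guest (apic_inject_timer_irq() is a
hypothetical helper; the pending counter matches the sketch above):

static void apic_inject_timer_irq(struct apic_sketch *apic);
			/* hypothetical: does the non-atomic injection work */

static void apic_timer_process(struct apic_sketch *apic)
{
	/* called from vcpu process context, vcpu->mutex held */
	int ticks = atomic_xchg(&apic->pending, 0);

	while (ticks-- > 0)
		apic_inject_timer_irq(apic);
}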

So I think there are three separate issues here:
- hrtimer migration: it helps performance, but doesn't help locking
- changing __apic_timer_fn() to do only atomic operations, and do the
non-atomic operations in process context under vcpu->mutex
- remove the apic lock

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.

