Dong, Eddie wrote:
> Avi Kivity wrote:
>> Dong, Eddie wrote:
>>>> What about preemption:
>>>>
>>>> - vcpu executes lapic code in qemu process context
>>>>
>>> Don't understand.  The LAPIC is in the kernel, how can qemu access it?
>>> If you mean qemu calling the APIC KVM syscall, then it already
>>> disabled preemption & took kvm->lock.
>>>
>> I meant qemu executes the KVM_VCPU_RUN ioctl.  kvm->lock does not
>> disable preemption (it is a mutex).
>>
> Just noticed it was changed to a mutex, but it seems the same here :-)
> If the process is switched to another task, it is OK since it won't
> access the local APIC.  Current VP access to the APIC will take the
> mutex first (see below).
> Or are you talking about another corner case?
>
apic access from process context is protected by kvm->lock, but apic
access from the hrtimer is not.  Consider this scenario:

- guest accesses apic
- apic code starts modifying apic data
  <preemption>
- timer fires
- apic_timer_fn() corrupts apic data

(I'm not even sure preemption is required here)

I think that in Xen this can't happen because it is not preemptible
and timers are processed when exiting back to the guest.

>> Do we really take kvm->lock for local accesses?  That's a
>> significant problem, much more than the timer.
>>
> Today all APIC/IOAPIC access comes from the shadow page fault path,
> which already takes kvm->lock.  KVM_IRQ_LINE will take it too.  (Just
> noticed the save/restore part missed this one; will add later if we
> agree here.)  PIC access comes from kernel_pio, which takes the mutex
> too.
>
> Another missing place is vmx_intr_assist, which needs to take the
> mutex too.  Will add later.
>

The apic can be protected by vcpu->mutex; platform-wide things (pic,
ioapic) should be protected by kvm->lock.  This will work if we move
all apic processing to process context, as I proposed in a previous
mail.

>> I meant in addition to timer migration (I really like the timer
>> migration part -- it's much more important than lock removal for
>> performance).  kvm_vcpu_kick() is needed to wake up from halt, or if
>> we have races between the timer and task migration.
>>
> :-) Actually we have solved this issue in the previous patch and this
> one naturally.  In the patch adding back the APIC timer IRQ, we will
> wake up the halted vCPU.
>
> In this patch, since the hrtimer always runs on the same pCPU as the
> guest VP (when the VP is active), each time the hrtimer fires (from a
> hardware IRQ) it already VM Exits to the kernel (serving a similar
> function to kvm_vcpu_kick, but with no need to explicitly call it),
> and then we do IRQ injection at vmx_intr_assist time.
>

Yes, the two solutions are very similar.
But I think mine protects against a race:

- scheduler starts migrating vcpu from cpu 0 to cpu 1
- hrtimer fires on cpu 0, but apic_timer_fn is not called yet
- vcpu on cpu 1 migrates the hrtimer
- vcpu enters guest mode on cpu 1
- cpu 0 calls apic_timer_fn

In this case, there will be no wakeup.  So I think you do need to call
kvm_vcpu_kick(), which will usually do nothing.

We also need to make sure all the non-atomic code in __apic_timer_fn()
is executed in process context (it can use the pending count to decide
how much to add).

So I think there are three separate issues here:

- hrtimer migration: it helps performance, but doesn't help locking
- changing __apic_timer_fn() to only do atomic operations, and doing
  the non-atomic operations in process context under vcpu->mutex
- removing the apic lock

-- 
Do not meddle in the internals of kernels, for they are subtle and
quick to panic.

_______________________________________________
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel