On Wed, 2008-03-26 at 15:14 -0300, Marcelo Tosatti wrote:
> On Wed, Mar 26, 2008 at 06:57:04PM +0200, Avi Kivity wrote:
> > Marcelo Tosatti wrote:
> > >
> > >>> QEMU/KVM: separate thread for IO handling
> > >>>
> > >>> Move IO processing from vcpu0 to a dedicated thread.
> > >>>
> > >>> This removes load on vcpu0 by allowing better cache locality and also
> > >>> improves latency.
> > >>>
> > >>> We can now block signal handling for IO events, so sigtimedwait won't
> > >>> race with handlers:
> > >>>
> > >>> - Currently the SIGALRM handler fails to set CPU_INTERRUPT_EXIT because
> > >>> the "next_cpu" variable is not initialized in the KVM path, meaning that
> > >>> processing of timer expiration might be delayed until the next vcpu0 
> > >>> exit.
> > >>>  
> > >>>       
> > >> I think main_loop_wait() is called unconditionally after every
> > >> signal.
> > >>     
> > >
> > > We exit the kvm_run() loop if CPU_INTERRUPT_EXIT is detected by 
> > > pre_kvm_run().
> > >
> > >   
> > 
> > But why do we need to exit the kvm_run() loop?  As I understand it, the 
> > I/O thread wakes up when the signal is queued and calls main_loop_wait() 
> > to process any events (through qemu_run_timers()).  If a timer needs to 
> > wake up a vcpu, it will raise an interrupt line which will wake the vcpu 
> > up, either in the kernel or in userspace depending on -no-kvm-irqchip.
> 
> In the current state, with the vcpu0 thread handling IO, the kvm_run()
> loop must bail out for main_loop_wait()->qemu_run_timers() to run.
> 
> If using a userspace timer such as the RTC (brought to attention by Dor's
> patches), the following happens:
> 
>     - signal wakes up the vcpu0 thread, which goes back to userspace.
>     - host_alarm_handler runs but fails to set CPU_INTERRUPT_EXIT 
>       because "next_cpu" is not initialized.
>     - pre_kvm_run() checks for CPU_INTERRUPT_EXIT and determines
>       it's not necessary to exit kvm_run(), so the vcpu0 thread goes
>       back into the kernel to enter guest mode.
> 
> No interrupt was raised even though the SIGALRM handler has executed.
> 

But what if the host_alarm_handler wakes up the sleeping IO thread instead?
That thread will then run the timers, and the irq will be triggered using an
ioctl (roughly the path sketched below). You're right that if the designated
vcpu was in userspace at that moment it might go back into the kernel first,
but the time to do so decreases, plus the following:

Every vcpu reads its irq vector before each vcpu_run, so the guest will
receive the irq on the next vmentry. The lapic code kicks the destination
vcpu (kvm_vcpu_kick) in order to cause a vmexit, so the interrupt will be
injected the next time around.
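
A rough illustration of that ioctl path for the in-kernel irqchip case;
this is a sketch from memory, not the exact qemu-kvm code, and kvm_vm_fd is
a placeholder for however the VM fd is reached:

    #include <linux/kvm.h>
    #include <sys/ioctl.h>

    /* Sketch: once the IO thread has run qemu_run_timers() and the RTC
     * model decides to assert its line, the level change ends up as a
     * KVM_IRQ_LINE ioctl on the VM fd. */
    static int assert_rtc_irq(int kvm_vm_fd)
    {
        struct kvm_irq_level event;

        event.irq   = 8;    /* ISA IRQ 8: the RTC line on the PC platform */
        event.level = 1;    /* assert; a later call with level 0 deasserts */

        /* The in-kernel PIC/IOAPIC takes it from here: if the target vcpu
         * is already in guest mode it gets kicked (kvm_vcpu_kick) so the
         * interrupt is injected on the next vmentry, without bouncing
         * through the vcpu thread's userspace at all. */
        return ioctl(kvm_vm_fd, KVM_IRQ_LINE, &event);
    }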

> AFAICS next_cpu is only initialized here:
> 
> static int main_loop(void)
> {
> ...
>     if (kvm_enabled()) {
>         kvm_main_loop();
>         cpu_disable_ticks();
>         return 0;
>     }
> 
>     cur_cpu = first_cpu;
>     next_cpu = cur_cpu->next_cpu ?: first_cpu;
>     for(;;) {
> 
> See? I pointed this out as it appears to be another factor in
> unreliable userspace timers.
> 
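
For context, the place where next_cpu ends up mattering is the check in
host_alarm_handler. The fragment below is a simplified sketch from memory,
not the verbatim qemu source; the stand-in declarations only exist to make
the shape of the code clear:

    /* Stand-ins for the real qemu definitions. */
    typedef struct CPUState CPUState;
    #define CPU_INTERRUPT_EXIT 0x04            /* placeholder value */
    extern void cpu_interrupt(CPUState *env, int mask);
    extern int qemu_timers_expired(void);      /* placeholder for the expiry checks */

    static CPUState *next_cpu;                 /* only set on the non-KVM main_loop() path */

    static void host_alarm_handler(int host_signum)
    {
        if (!qemu_timers_expired())
            return;
        /* On the TCG path next_cpu was set by main_loop(), so the running
         * vcpu is forced out and main_loop_wait()/qemu_run_timers() get a
         * chance to run.  On the KVM path next_cpu is still NULL, so nothing
         * is interrupted here and the expiry is only noticed on the next
         * natural vcpu0 exit, which is the case described above. */
        if (next_cpu)
            cpu_interrupt(next_cpu, CPU_INTERRUPT_EXIT);
    }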

