On Tue, Feb 01, 2011 at 02:58:02PM +0100, Jan Kiszka wrote: > On 2011-02-01 14:48, Marcelo Tosatti wrote: > > On Tue, Feb 01, 2011 at 02:32:38PM +0100, Jan Kiszka wrote: > >> On 2011-02-01 13:47, Marcelo Tosatti wrote: > >>> On Thu, Jan 27, 2011 at 02:09:58PM +0100, Jan Kiszka wrote: > >>>> Found by Stefan Hajnoczi: There is a race in kvm_cpu_exec between > >>>> checking for exit_request on vcpu entry and timer signals arriving > >>>> before KVM starts to catch them. Plug it by blocking both timer related > >>>> signals also on !CONFIG_IOTHREAD and process those via signalfd. > >>>> > >>>> Signed-off-by: Jan Kiszka <jan.kis...@siemens.com> > >>>> CC: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com> > >>>> --- > >>>> cpus.c | 18 ++++++++++++++++++ > >>>> 1 files changed, 18 insertions(+), 0 deletions(-) > >>>> > >>>> diff --git a/cpus.c b/cpus.c > >>>> index fc3f222..29b1070 100644 > >>>> --- a/cpus.c > >>>> +++ b/cpus.c > >>>> @@ -254,6 +254,10 @@ static void qemu_kvm_init_cpu_signals(CPUState *env) > >>>> pthread_sigmask(SIG_BLOCK, NULL, &set); > >>>> sigdelset(&set, SIG_IPI); > >>>> sigdelset(&set, SIGBUS); > >>>> +#ifndef CONFIG_IOTHREAD > >>>> + sigdelset(&set, SIGIO); > >>>> + sigdelset(&set, SIGALRM); > >>>> +#endif > >>> > >>> I'd prefer separate qemu_kvm_init_cpu_signals in the !IOTHREAD > >>> section. > >> > >> You mean to duplicate qemu_kvm_init_cpu_signals for both configurations? > > > > Yes, so to avoid #ifdefs spread. > > Would exchange some #ifdefs against ifndef _WIN32. Haven't measured the > delta though. > > > > >>>> + > >>>> +#ifndef CONFIG_IOTHREAD > >>>> + if (sigismember(&chkset, SIGIO) || sigismember(&chkset, SIGALRM)) { > >>>> + qemu_notify_event(); > >>>> + } > >>>> +#endif > >>> > >>> Why is this necessary? > >>> > >>> You should break out of cpu_exec_all if there's a pending alarm (see > >>> qemu_alarm_pending()). > >> > >> qemu_alarm_pending() is not true until the signal is actually taken. The > >> alarm handler sets the required flags. > > > > Right. What i mean is you need to execute the signal handler inside > > cpu_exec_all loop (so that alarm pending is set). > > > > So, if there is a SIGALRM pending, qemu_run_timers has highest > > priority, not vcpu execution. > > We leave the vcpu loop (thanks to notify_event), process the signal in > the event loop and run the timer handler. This pattern is IMO less > invasive to the existing code, specifically as it is about to die > long-term anyway.
You'll probably see poor timer behaviour on smp guests without iothread enabled.