Re: [kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 > @@ -369,6 +372,10 @@ static void *ap_main_loop(void *_env) > sigfillset(&signals); > sigprocmask(SIG_BLOCK, &signals, NULL); > kvm_create_vcpu(kvm_context, env->cpu_index); > +pthread_mutex_lock(&vcpu_mutex); > +vcpu->created = 1; > +pthread_cond_signal(&qemu_vcpuup_cond); > +pthread_mutex_unlock(&vcpu_mutex); > kvm_qemu_init_env(env); > kvm_main_loop_cpu(env); > return NULL; Still no locking needed on any CPU we support. The memory access is atomic and that's all that counts here. With the mutex taken the woken thread immediately runs into a brick wall and has to be put to sleep again. - -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.7 (GNU/Linux) iD8DBQFIFm/w2ijCOnn/RHQRAmxRAKDHoem5zvdt6gSVdoX6vQoYPr106QCcC3IA RzKcqRUNgUmUDzqnrkjHrAI= =MNbq -END PGP SIGNATURE- - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
Ryan Harper wrote: > * Avi Kivity <[EMAIL PROTECTED]> [2008-04-26 02:23]: > >> Please reuse qemu_mutex for this, no need for a new one. >> > > I'm having a little trouble wrapping my head around all of the locking > here. If I avoid qemu_mutex and use a new one, I've got everything > working. However, attemping to use qemu_mutex is stepping into a pile > of locking crap. I'm sure there is a good reason... > > The current code looks like this: > > Thread1: > main() > kvm_qemu_init() // mutex_lock() > machine->init() >pc_init1() > pc_new_cpu() > cpu_init() > cpu_x86_init() >kvm_init_new_ap() // create vcpu Thread2 ><-- > <-- > <-- > <-- ><-- > <-- > kvm_main_loop() // mutex_unlock() > > Thread2: > ap_main_loop() > /* vcpu init */ > kvm_main_loop_cpu() > kvm_main_loop_wait() // mutex_unlock() on enter, lock on exit >kvm_eat_signals() // mutex_lock() on enter, unlock on exit ><-- > <-- > <-- > The qemu_mutex is meant to ensure that the QEMU code is only ever entered by one thread at a time. QEMU is not thread-safe so this is necessary. It's a little odd because we're taking this lock initially by being called from QEMU code. Here's the basic theory of locking AFAICT: kvm_main_loop_cpu() is the main loop for each VCPU. It must acquire the QEMU mutex since it will call into normal QEMU code to process events. Whenever a VCPU is allowed to run, or when the VCPU is idling, it needs to release the QEMU mutex. The former is done via the post_kvm_run() and pre_kvm_run() hooks. The later is done within kvm_main_loop_wait(). kvm_main_loop_wait() will release the QEMU mutex and call kvm_eat_signals() which calls kvm_eat_signal() which will issue a sigtimedwait(). This is where we actually idle (and why SIGIO is so important right now). We don't want to idle with the QEMU mutex held as that may result in dead lock so this is why we release it here. kvm_eat_signal() has to acquire the lock again in order to dispatch IO events (via kvm_process_signal()). Regards, Anthony Liguori > It wedges up in kvm_init_new_ap() if I attempt acquire qemu_mutex. > Quite obvious after I looked at the call trace and discovered > kvm_qemu_init() locking on exit. I see other various functions that > unlock and then lock; I really don't want to wade into this mess... > rather whomever cooked it up should do some cleanup. I tried the > unlock, then re-lock on exit in kvm_init_new_ap() but that also wedged. > > Here is a rework with a new flag in vcpu_info indicating vcpu > creation. Tested this with 64 1VCPU guests booting with 1 second delay, > and single 16-way SMP guest boot. > > - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
* Avi Kivity <[EMAIL PROTECTED]> [2008-04-26 02:23]: > > Please reuse qemu_mutex for this, no need for a new one. I'm having a little trouble wrapping my head around all of the locking here. If I avoid qemu_mutex and use a new one, I've got everything working. However, attemping to use qemu_mutex is stepping into a pile of locking crap. I'm sure there is a good reason... The current code looks like this: Thread1: main() kvm_qemu_init() // mutex_lock() machine->init() pc_init1() pc_new_cpu() cpu_init() cpu_x86_init() kvm_init_new_ap() // create vcpu Thread2 <-- <-- <-- <-- <-- <-- kvm_main_loop() // mutex_unlock() Thread2: ap_main_loop() /* vcpu init */ kvm_main_loop_cpu() kvm_main_loop_wait() // mutex_unlock() on enter, lock on exit kvm_eat_signals() // mutex_lock() on enter, unlock on exit <-- <-- <-- It wedges up in kvm_init_new_ap() if I attempt acquire qemu_mutex. Quite obvious after I looked at the call trace and discovered kvm_qemu_init() locking on exit. I see other various functions that unlock and then lock; I really don't want to wade into this mess... rather whomever cooked it up should do some cleanup. I tried the unlock, then re-lock on exit in kvm_init_new_ap() but that also wedged. Here is a rework with a new flag in vcpu_info indicating vcpu creation. Tested this with 64 1VCPU guests booting with 1 second delay, and single 16-way SMP guest boot. -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx (512) 838-9253 T/L: 678-9253 [EMAIL PROTECTED] diffstat output: qemu-kvm.c | 14 -- 1 files changed, 12 insertions(+), 2 deletions(-) Signed-off-by: Ryan Harper <[EMAIL PROTECTED]> --- diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c index 78127de..2768ea5 100644 --- a/qemu/qemu-kvm.c +++ b/qemu/qemu-kvm.c @@ -31,7 +31,9 @@ extern int smp_cpus; static int qemu_kvm_reset_requested; pthread_mutex_t qemu_mutex = PTHREAD_MUTEX_INITIALIZER; +pthread_mutex_t vcpu_mutex = PTHREAD_MUTEX_INITIALIZER; pthread_cond_t qemu_aio_cond = PTHREAD_COND_INITIALIZER; +pthread_cond_t qemu_vcpuup_cond = PTHREAD_COND_INITIALIZER; __thread struct vcpu_info *vcpu; struct qemu_kvm_signal_table { @@ -53,6 +55,7 @@ struct vcpu_info { int stop; int stopped; int reload_regs; +int created; } vcpu_info[256]; pthread_t io_thread; @@ -369,6 +372,10 @@ static void *ap_main_loop(void *_env) sigfillset(&signals); sigprocmask(SIG_BLOCK, &signals, NULL); kvm_create_vcpu(kvm_context, env->cpu_index); +pthread_mutex_lock(&vcpu_mutex); +vcpu->created = 1; +pthread_cond_signal(&qemu_vcpuup_cond); +pthread_mutex_unlock(&vcpu_mutex); kvm_qemu_init_env(env); kvm_main_loop_cpu(env); return NULL; @@ -388,9 +395,12 @@ static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum) void kvm_init_new_ap(int cpu, CPUState *env) { +pthread_mutex_lock(&vcpu_mutex); pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env); -/* FIXME: wait for thread to spin up */ -usleep(200); +while (vcpu_info[cpu].created == 0) { +pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex); +} +pthread_mutex_unlock(&vcpu_mutex); } static void qemu_kvm_init_signal_tables(void) - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
Ulrich Drepper wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Ryan Harper wrote: > >> @@ -388,9 +395,10 @@ static void kvm_add_signal(struct qemu_kvm_signal_table >> *sigtab, int signum) >> >> void kvm_init_new_ap(int cpu, CPUState *env) >> { >> +pthread_mutex_lock(&vcpu_mutex); >> pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env); >> -/* FIXME: wait for thread to spin up */ >> -usleep(200); >> +pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex); >> +pthread_mutex_unlock(&vcpu_mutex); >> > > And something is very wrong here. The pattern for using a condvar is > > 1 take mutex > > 2 check condition > > 3 if condition is not fulfilled > 3a call cond_wait > 3b when returning, go back to step 2 > > 4 unlock mutex > > > Anything else is buggy. > > So, either your condvar use is wrong or you don't really want a condvar > in the first place. I haven't checked the code. > A flag is needed in the vcpu structure that indicates whether the vcpu spun up or not. This is what the while () condition should be. This is needed as the thread may spin up before it gets to the pthread_cond_wait() in which case the signal happens when noone is waiting on it. The other reason a while () is usually needed is that cond_signal may not wake up the right thread so it's necessary to check whether you really have something to do. Not really a problem here but the former race is. A condvar is definitely the right thing to use here. Regards, Anthony Liguori > - -- > ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ > -BEGIN PGP SIGNATURE- > Version: GnuPG v1.4.7 (GNU/Linux) > > iD8DBQFIE2CD2ijCOnn/RHQRAs4kAJ40kbWjNJAzj2gGdbo/sSxZTx5b0ACglbis > kw7ST4eJK9CXhNbjKphNsUo= > =ISaC > -END PGP SIGNATURE- > > - > This SF.net email is sponsored by the 2008 JavaOne(SM) Conference > Don't miss this year's exciting event. There's still time to save $100. > Use priority code J8TL2D2. > http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone > ___ > kvm-devel mailing list > kvm-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/kvm-devel > - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Ryan Harper wrote: > @@ -388,9 +395,10 @@ static void kvm_add_signal(struct qemu_kvm_signal_table > *sigtab, int signum) > > void kvm_init_new_ap(int cpu, CPUState *env) > { > +pthread_mutex_lock(&vcpu_mutex); > pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env); > -/* FIXME: wait for thread to spin up */ > -usleep(200); > +pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex); > +pthread_mutex_unlock(&vcpu_mutex); And something is very wrong here. The pattern for using a condvar is 1 take mutex 2 check condition 3 if condition is not fulfilled 3a call cond_wait 3b when returning, go back to step 2 4 unlock mutex Anything else is buggy. So, either your condvar use is wrong or you don't really want a condvar in the first place. I haven't checked the code. - -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.7 (GNU/Linux) iD8DBQFIE2CD2ijCOnn/RHQRAs4kAJ40kbWjNJAzj2gGdbo/sSxZTx5b0ACglbis kw7ST4eJK9CXhNbjKphNsUo= =ISaC -END PGP SIGNATURE- - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Ryan Harper wrote: > +/* block until cond_wait occurs */ > +pthread_mutex_lock(&vcpu_mutex); > +/* now we can signal */ > +pthread_cond_signal(&qemu_vcpuup_cond); > +pthread_mutex_unlock(&vcpu_mutex); It is not necessary to take the mutex for a condvar if you're just waking a waiter. This just unnecessarily slows things down. - -- ➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖ -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.7 (GNU/Linux) iD8DBQFIE19D2ijCOnn/RHQRAvLqAJ9j+7ICPHpB/gCthCelDgYn5poGMgCfaZVy +tE5zOuxqlBMUR7Fgufw/wY= =5j4s -END PGP SIGNATURE- - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
Ryan Harper wrote: > There is a race between when the vcpu thread issues a create ioctl and when > apic_reset() gets called resulting in getting a badfd error. > > The problem is indeed there, but the fix is wrong: > main threadvcpu thread > diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c > index 78127de..3513e8c 100644 > --- a/qemu/qemu-kvm.c > +++ b/qemu/qemu-kvm.c > @@ -31,7 +31,9 @@ extern int smp_cpus; > static int qemu_kvm_reset_requested; > > pthread_mutex_t qemu_mutex = PTHREAD_MUTEX_INITIALIZER; > +pthread_mutex_t vcpu_mutex = PTHREAD_MUTEX_INITIALIZER; > pthread_cond_t qemu_aio_cond = PTHREAD_COND_INITIALIZER; > +pthread_cond_t qemu_vcpuup_cond = PTHREAD_COND_INITIALIZER; > __thread struct vcpu_info *vcpu; > > struct qemu_kvm_signal_table { > @@ -369,6 +371,11 @@ static void *ap_main_loop(void *_env) > sigfillset(&signals); > sigprocmask(SIG_BLOCK, &signals, NULL); > kvm_create_vcpu(kvm_context, env->cpu_index); > +/* block until cond_wait occurs */ > +pthread_mutex_lock(&vcpu_mutex); > +/* now we can signal */ > +pthread_cond_signal(&qemu_vcpuup_cond); > +pthread_mutex_unlock(&vcpu_mutex); > kvm_qemu_init_env(env); > kvm_main_loop_cpu(env); > return NULL; > @@ -388,9 +395,10 @@ static void kvm_add_signal(struct qemu_kvm_signal_table > *sigtab, int signum) > > void kvm_init_new_ap(int cpu, CPUState *env) > { > +pthread_mutex_lock(&vcpu_mutex); > pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env); > -/* FIXME: wait for thread to spin up */ > -usleep(200); > +pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex); > pthread_cond_wait() is never correct outside a loop. The signal may arrive before wait is called. The usual idiom is while (condition is not fulfilled) pthread_cond_wait(); I see you have something there to ensure we block, but please use the right idiom. > +pthread_mutex_unlock(&vcpu_mutex); > } > Please reuse qemu_mutex for this, no need for a new one. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
* Ryan Harper <[EMAIL PROTECTED]> [2008-04-26 00:27]: > There is a race between when the vcpu thread issues a create ioctl and when > apic_reset() gets called resulting in getting a badfd error. > > main threadvcpu thread guilt refresh clipped my text short. main threadvcpu thread ------ qemu/hw/pc.c:pc_new_cpu() cpu_init() cpu_x86_init() kvm_init_new_ap() ap_main_loop() *blocks* usleep() apic_init() kvm_set_lapic() kvm_ioctl with unitilized context badfd To fix this, ensure we create the vcpu in the vcpu thread before returning from kvm_init_new_ap. Synchronize on a new mutux, vcpu_mutex, and wait for the vcpuup condition before signaling to ensure the main thread is waiting before we send the signal. With this patch, I can launch 64 kvm guests, 1 second apart and not see any Bad File descriptor errors. Signed-off-by: Ryan Harper <[EMAIL PROTECTED]> -- Ryan Harper Software Engineer; Linux Technology Center IBM Corp., Austin, Tx (512) 838-9253 T/L: 678-9253 [EMAIL PROTECTED] - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
[kvm-devel] [PATCH] Fix QEMU vcpu thread race with apic_reset
There is a race between when the vcpu thread issues a create ioctl and when apic_reset() gets called resulting in getting a badfd error. main threadvcpu thread diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c index 78127de..3513e8c 100644 --- a/qemu/qemu-kvm.c +++ b/qemu/qemu-kvm.c @@ -31,7 +31,9 @@ extern int smp_cpus; static int qemu_kvm_reset_requested; pthread_mutex_t qemu_mutex = PTHREAD_MUTEX_INITIALIZER; +pthread_mutex_t vcpu_mutex = PTHREAD_MUTEX_INITIALIZER; pthread_cond_t qemu_aio_cond = PTHREAD_COND_INITIALIZER; +pthread_cond_t qemu_vcpuup_cond = PTHREAD_COND_INITIALIZER; __thread struct vcpu_info *vcpu; struct qemu_kvm_signal_table { @@ -369,6 +371,11 @@ static void *ap_main_loop(void *_env) sigfillset(&signals); sigprocmask(SIG_BLOCK, &signals, NULL); kvm_create_vcpu(kvm_context, env->cpu_index); +/* block until cond_wait occurs */ +pthread_mutex_lock(&vcpu_mutex); +/* now we can signal */ +pthread_cond_signal(&qemu_vcpuup_cond); +pthread_mutex_unlock(&vcpu_mutex); kvm_qemu_init_env(env); kvm_main_loop_cpu(env); return NULL; @@ -388,9 +395,10 @@ static void kvm_add_signal(struct qemu_kvm_signal_table *sigtab, int signum) void kvm_init_new_ap(int cpu, CPUState *env) { +pthread_mutex_lock(&vcpu_mutex); pthread_create(&vcpu_info[cpu].thread, NULL, ap_main_loop, env); -/* FIXME: wait for thread to spin up */ -usleep(200); +pthread_cond_wait(&qemu_vcpuup_cond, &vcpu_mutex); +pthread_mutex_unlock(&vcpu_mutex); } static void qemu_kvm_init_signal_tables(void) - This SF.net email is sponsored by the 2008 JavaOne(SM) Conference Don't miss this year's exciting event. There's still time to save $100. Use priority code J8TL2D2. http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel