On 28.11.2014 at 15:12, Paolo Bonzini wrote:
> From: Peter Lieven <p...@kamp.de>
>
> Placing coroutines on the global pool should be preferable, because it
> can help all threads.  But if the global pool is full, we can still
> try to save some allocations by stashing completed coroutines on the
> local pool.  This is quite cheap too, because it does not require
> atomic operations.

At least in test-coroutine.c this turns out to be more than just a nice-to-have.
I have not fully understood why, but I get the following results:

master:
Run operation 40000000 iterations 13.612604 s, 2938K operations/s, 340ns per coroutine

this series up to patch 6:
Run operation 40000000 iterations 10.428382 s, 3835K operations/s, 260ns per coroutine

this series up to patch 7:
Run operation 40000000 iterations 9.112539 s, 4389K operations/s, 227ns per coroutine

So this confirms the +33% Paolo sees up to patch 5. But I have not yet fully
understood the +15% that this patch gains.

>
> Signed-off-by: Peter Lieven <p...@kamp.de>
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>  qemu-coroutine.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/qemu-coroutine.c b/qemu-coroutine.c
> index da1b961..977f114 100644
> --- a/qemu-coroutine.c
> +++ b/qemu-coroutine.c
> @@ -27,6 +27,7 @@ enum {
>  static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
>  static unsigned int release_pool_size;
>  static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool);
> +static __thread unsigned int alloc_pool_size;
>  static __thread Notifier coroutine_pool_cleanup_notifier;
>
>  static void coroutine_pool_cleanup(Notifier *n, void *value)
> @@ -58,13 +59,14 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
>                   * release_pool_size and the actual size of release_pool.  But
>                   * it is just a heuristic, it does not need to be perfect.
>                   */
> -                release_pool_size = 0;
> +                alloc_pool_size += atomic_xchg(&release_pool_size, 0);

I had "alloc_pool_size =" in my original patch. It shouldn't make a difference,
since alloc_pool_size should be 0 when we reach this piece of code. But if
release_pool_size is inaccurate for some reason, we add this error to
alloc_pool_size again and again. In the worst case we eventually end up not
adding coroutines to the thread-local pool below, although it might be empty.

Peter
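
To illustrate the "+=" versus "=" concern, here is a minimal, self-contained
sketch in C. It is a toy model, not QEMU code: LocalPool, refill(), drain(),
residual_after_cycles(), stash_allowed() and the skew parameter are all
hypothetical names, and POOL_BATCH_SIZE is set to 64 purely for illustration.
It assumes that the global counter over-reports the real list length by a
fixed skew on every refill.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative value only; stands in for the batch size in qemu-coroutine.c. */
enum { POOL_BATCH_SIZE = 64 };

/* How the thread-local counter is updated when the global pool is taken over. */
typedef enum { REFILL_ASSIGN, REFILL_ACCUMULATE } RefillMode;

/* Toy model of the thread-local pool (hypothetical, not QEMU's structures). */
typedef struct {
    unsigned int counter; /* models alloc_pool_size */
    unsigned int actual;  /* real number of coroutines in the local list */
} LocalPool;

/*
 * Take over `moved` coroutines from the global pool while the global
 * counter reports `moved + skew` (assumed inaccuracy).
 */
void refill(LocalPool *p, RefillMode mode, unsigned int moved, unsigned int skew)
{
    unsigned int reported = moved + skew;

    assert(p->actual == 0); /* a refill only happens on an empty local pool */
    if (mode == REFILL_ASSIGN) {
        p->counter = reported;   /* "=": a stale residual is overwritten */
    } else {
        p->counter += reported;  /* "+=": a stale residual is carried over */
    }
    p->actual = moved;
}

/* Allocate every coroutine that is really in the local pool. */
void drain(LocalPool *p)
{
    while (p->actual > 0) {
        p->actual--;
        p->counter--;
    }
}

/* Counter value left over after `cycles` rounds of refill followed by drain. */
unsigned int residual_after_cycles(RefillMode mode, unsigned int cycles,
                                   unsigned int moved, unsigned int skew)
{
    LocalPool p = { 0, 0 };
    unsigned int i;

    for (i = 0; i < cycles; i++) {
        refill(&p, mode, moved, skew);
        drain(&p);
    }
    return p.counter;
}

/* Models the release-side check: stash locally only below the batch size. */
bool stash_allowed(unsigned int counter)
{
    return counter < (unsigned int)POOL_BATCH_SIZE;
}
```

With moved = 65 and skew = 1, residual_after_cycles(REFILL_ASSIGN, 100, 65, 1)
yields 1, while REFILL_ACCUMULATE yields 100. The latter exceeds the
illustrative batch size, so stash_allowed() would refuse the local stash even
though the modeled pool is empty.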