On Thu, Jul 3, 2014 at 12:21 AM, Paolo Bonzini <pbonz...@redhat.com> wrote:
> On 02/07/2014 17:45, Ming Lei wrote:
>> The attachment debug patch skips aio_notify() if qemu_bh_schedule
>> is running from the current aio context, but it looks like there
>> are still 120K writes triggered. (Without the patch, 400K can be
>> observed in the same test.)
>
> Nice. Another observation is that after aio_dispatch we'll always
> re-evaluate everything (bottom halves, file descriptors and timeouts),

The idea is very good. If aio_notify() is called from the 1st
aio_dispatch() in aio_poll(), ctx->notifier might need to be set, but
it can be handled easily.

> so we can skip the aio_notify if we're inside aio_dispatch.
>
> So what about this untested patch:
>
> diff --git a/aio-posix.c b/aio-posix.c
> index f921d4f..a23d85d 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c

#include "qemu/atomic.h" is needed here for the smp_wmb() barriers.

> @@ -124,6 +124,9 @@ static bool aio_dispatch(AioContext *ctx)
>      AioHandler *node;
>      bool progress = false;
>
> +    /* No need to set the event notifier during aio_notify. */
> +    ctx->running++;
> +
>      /*
>       * We have to walk very carefully in case qemu_aio_set_fd_handler is
>       * called while we're walking.
> @@ -169,6 +171,11 @@ static bool aio_dispatch(AioContext *ctx)
>      /* Run our timers */
>      progress |= timerlistgroup_run_timers(&ctx->tlg);
>
> +    smp_wmb();
> +    ctx->iter_count++;
> +    smp_wmb();
> +    ctx->running--;
> +
>      return progress;
>  }
>
> diff --git a/async.c b/async.c
> index 5b6fe6b..1f56afa 100644
> --- a/async.c
> +++ b/async.c

#include "qemu/atomic.h" is needed here too, for the smp_rmb() barriers.

> @@ -249,7 +249,19 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
>
>  void aio_notify(AioContext *ctx)
>  {
> -    event_notifier_set(&ctx->notifier);
> +    uint32_t iter_count;
> +    do {
> +        iter_count = ctx->iter_count;
> +        /* Read ctx->iter_count before ctx->running. */
> +        smb_rmb();

s/smb/smp

> +        if (!ctx->running) {
> +            event_notifier_set(&ctx->notifier);
> +            return;
> +        }
> +        /* Read ctx->running before ctx->iter_count. */
> +        smb_rmb();

s/smb/smp

> +        /* ctx might have gone to sleep. */
> +    } while (iter_count != ctx->iter_count);
>  }

Since both 'running' and 'iter_count' may be read locklessly,
something like ACCESS_ONCE() should be used to keep the compiler from
optimizing the reads (see the sketch below).

>
>  static void aio_timerlist_notify(void *opaque)
> @@ -269,6 +279,7 @@ AioContext *aio_context_new(void)
>      ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
>      ctx->pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
>      ctx->thread_pool = NULL;
> +    ctx->iter_count = ctx->running = 0;
>      qemu_mutex_init(&ctx->bh_lock);
>      rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
>      event_notifier_init(&ctx->notifier, false);
> diff --git a/include/block/aio.h b/include/block/aio.h
> index a92511b..9f51c4f 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -51,6 +51,9 @@ struct AioContext {
>      /* Protects all fields from multi-threaded access */
>      RFifoLock lock;
>
> +    /* Used to avoid aio_notify while dispatching event handlers.
> +     * Writes protected by lock or BQL, reads are lockless.
> +     */
> +    uint32_t iter_count, running;
> +
>      /* The list of registered AIO handlers */
>      QLIST_HEAD(, AioHandler) aio_handlers;
>

In my test, it does decrease the number of write() calls very much,
and I hope a formal version can be applied soon.

Thanks,
--
Ming Lei
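
P.S. To make the ACCESS_ONCE() point concrete, here is a rough,
untested sketch of aio_notify() with the two fixes above folded in
(s/smb_rmb/smp_rmb/ and marking the lockless reads). It assumes
QEMU's atomic_read() helper from include/qemu/atomic.h, which expands
to an ACCESS_ONCE()-style read:

void aio_notify(AioContext *ctx)
{
    uint32_t iter_count;

    do {
        /* Lockless read; atomic_read() keeps the compiler from
         * caching or tearing the load.
         */
        iter_count = atomic_read(&ctx->iter_count);

        /* Read ctx->iter_count before ctx->running. */
        smp_rmb();

        if (!atomic_read(&ctx->running)) {
            /* Not inside aio_dispatch: the notifier must be set. */
            event_notifier_set(&ctx->notifier);
            return;
        }

        /* Read ctx->running before re-reading ctx->iter_count. */
        smp_rmb();

        /* If iter_count changed, aio_dispatch() has finished and ctx
         * might have gone to sleep, so retry instead of dropping the
         * notification.
         */
    } while (iter_count != atomic_read(&ctx->iter_count));
}

The retry loop is what closes the race where aio_dispatch() completes
between the two reads; without it a notification could be lost while
the context goes back to sleep.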