Re: [Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-08-04 Thread Ming Lei
On Mon, Aug 4, 2014 at 6:21 PM, Stefan Hajnoczi  wrote:
> On Fri, Aug 01, 2014 at 03:42:05PM +0800, Ming Lei wrote:
>> On Thu, Jul 31, 2014 at 5:18 PM, Paolo Bonzini  wrote:
>> > Il 31/07/2014 05:22, Ming Lei ha scritto:
>> >>> >
>> >>> > The problem is that g_slice here is not using the slab-style allocator
>> >>> > because the object is larger than roughly 500 bytes.  One solution 
>> >>> > would
>> >>> > be to make virtqueue_pop/vring_pop allocate a VirtQueueElement of the
>> >>> > right size (and virtqueue_push/vring_push free it), as mentioned in the
>> >>> > review of patch 8.
>> >> Unluckily, neither the iovec nor the addr array can fit into 500 bytes, :-(
>> >> Not to mention that all users of VirtQueueElement would need to be changed too;
>> >> I'd hate to pull that work into this patchset, :-)
>> >
>> > Well, the point of dataplane was not just to get maximum iops.  It was
>> > also to provide guidance in the work necessary to improve the code and
>> > get maximum iops without special-casing everything.  This can be a lot
>> > of work indeed.
>> >
>> >>> >
>> >>> > However, I now remembered that VirtQueueElement is a mess because it's
>> >>> > serialized directly into the migration state. :(  So you basically
>> >>> > cannot change it without mucking with migration.  Please leave out 
>> >>> > patch
>> >>> > 8 for now.
>> >> save_device code serializes elem in this way:
>> >>
>> >>  qemu_put_buffer(f, (unsigned char *)&req->elem,
>> >> sizeof(VirtQueueElement));
>> >>
>> >> so I am wondering why this patch may break migration.
>> >
>> > Because you change the on-wire format and break migration from 2.1 to
>> > 2.2.  Sorry, I wasn't clear enough.
>>
>> That is really a mess, but in the future we may still convert VirtQueueElement
>> into a smarter one and keep the original structure only for save/load, with
>> a conversion between the two structures in save/load.
>
> This is a good idea.  The VirtQueueElement structure isn't complicated,
> it's just big.  Just use the current layout as the serialization format.

The patch shouldn't be too complicated either; the only annoying part is
finding each virtio device's load/save functions and adding the conversion
in both of them if VirtQueueElement is involved.

I suggest doing the conversion in a separate, standalone patchset.
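To make that concrete, a rough sketch of the conversion might look like the following (CompactVirtQueueElement and save_elem_compat are hypothetical names, not existing QEMU code; the field names follow the VirtQueueElement of the time, and only qemu_put_buffer()/qemu_get_buffer() plus the legacy fixed-size layout are taken from this thread):

/* Sketch only: keep the old fixed-size VirtQueueElement purely as the
 * migration wire format, while the runtime uses a smaller element whose
 * arrays are sized to the request.  Assumes the usual QEMU virtio and
 * migration headers. */
typedef struct CompactVirtQueueElement {
    unsigned int index;
    unsigned int out_num;
    unsigned int in_num;
    hwaddr *in_addr;            /* in_num entries, allocated to fit */
    hwaddr *out_addr;           /* out_num entries, allocated to fit */
    struct iovec *in_sg;
    struct iovec *out_sg;
} CompactVirtQueueElement;

static void save_elem_compat(QEMUFile *f, const CompactVirtQueueElement *e)
{
    VirtQueueElement legacy;    /* old fixed-size struct, kept as wire format */

    memset(&legacy, 0, sizeof(legacy));
    legacy.index = e->index;
    legacy.out_num = e->out_num;
    legacy.in_num = e->in_num;
    memcpy(legacy.in_addr, e->in_addr, e->in_num * sizeof(legacy.in_addr[0]));
    memcpy(legacy.out_addr, e->out_addr, e->out_num * sizeof(legacy.out_addr[0]));
    memcpy(legacy.in_sg, e->in_sg, e->in_num * sizeof(legacy.in_sg[0]));
    memcpy(legacy.out_sg, e->out_sg, e->out_num * sizeof(legacy.out_sg[0]));

    /* Same bytes go on the wire as today, so 2.1 <-> 2.2 migration is unaffected. */
    qemu_put_buffer(f, (unsigned char *)&legacy, sizeof(legacy));
}

The load side would do the reverse: qemu_get_buffer() into the legacy struct, then copy the used entries into the compact element.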

Thanks,



Re: [Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-08-04 Thread Stefan Hajnoczi
On Fri, Aug 01, 2014 at 03:42:05PM +0800, Ming Lei wrote:
> On Thu, Jul 31, 2014 at 5:18 PM, Paolo Bonzini  wrote:
> > Il 31/07/2014 05:22, Ming Lei ha scritto:
> >>> >
> >>> > The problem is that g_slice here is not using the slab-style allocator
> >>> > because the object is larger than roughly 500 bytes.  One solution would
> >>> > be to make virtqueue_pop/vring_pop allocate a VirtQueueElement of the
> >>> > right size (and virtqueue_push/vring_push free it), as mentioned in the
> >>> > review of patch 8.
> >> Unluckily, neither the iovec nor the addr array can fit into 500 bytes, :-(
> >> Not to mention that all users of VirtQueueElement would need to be changed too;
> >> I'd hate to pull that work into this patchset, :-)
> >
> > Well, the point of dataplane was not just to get maximum iops.  It was
> > also to provide guidance in the work necessary to improve the code and
> > get maximum iops without special-casing everything.  This can be a lot
> > of work indeed.
> >
> >>> >
> >>> > However, I now remembered that VirtQueueElement is a mess because it's
> >>> > serialized directly into the migration state. :(  So you basically
> >>> > cannot change it without mucking with migration.  Please leave out patch
> >>> > 8 for now.
> >> save_device code serializes elem in this way:
> >>
> >>  qemu_put_buffer(f, (unsigned char *)&req->elem,
> >> sizeof(VirtQueueElement));
> >>
> >> so I am wondering why this patch may break migration.
> >
> > Because you change the on-wire format and break migration from 2.1 to
> > 2.2.  Sorry, I wasn't clear enough.
> 
> That is really a mess, but in the future we may still convert VirtQueueElement
> into a smarter one and keep the original structure only for save/load, with
> a conversion between the two structures in save/load.

This is a good idea.  The VirtQueueElement structure isn't complicated,
it's just big.  Just use the current layout as the serialization format.

Stefan




Re: [Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-08-01 Thread Ming Lei
On Thu, Jul 31, 2014 at 5:18 PM, Paolo Bonzini  wrote:
> Il 31/07/2014 05:22, Ming Lei ha scritto:
>>> >
>>> > The problem is that g_slice here is not using the slab-style allocator
>>> > because the object is larger than roughly 500 bytes.  One solution would
>>> > be to make virtqueue_pop/vring_pop allocate a VirtQueueElement of the
>>> > right size (and virtqueue_push/vring_push free it), as mentioned in the
>>> > review of patch 8.
>> Unluckily, neither the iovec nor the addr array can fit into 500 bytes, :-(
>> Not to mention that all users of VirtQueueElement would need to be changed too;
>> I'd hate to pull that work into this patchset, :-)
>
> Well, the point of dataplane was not just to get maximum iops.  It was
> also to provide guidance in the work necessary to improve the code and
> get maximum iops without special-casing everything.  This can be a lot
> of work indeed.
>
>>> >
>>> > However, I now remembered that VirtQueueElement is a mess because it's
>>> > serialized directly into the migration state. :(  So you basically
>>> > cannot change it without mucking with migration.  Please leave out patch
>>> > 8 for now.
>> save_device code serializes elem in this way:
>>
>>  qemu_put_buffer(f, (unsigned char *)&req->elem,
>> sizeof(VirtQueueElement));
>>
>> so I am wondering why this patch may break migration.
>
> Because you change the on-wire format and break migration from 2.1 to
> 2.2.  Sorry, I wasn't clear enough.

That is really a mess, but in the future we may still convert VirtQueueElement
into a smarter one and keep the original structure only for save/load, with
a conversion between the two structures in save/load.


Thanks,



Re: [Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-07-31 Thread Paolo Bonzini
Il 31/07/2014 05:22, Ming Lei ha scritto:
>> >
>> > The problem is that g_slice here is not using the slab-style allocator
>> > because the object is larger than roughly 500 bytes.  One solution would
>> > be to make virtqueue_pop/vring_pop allocate a VirtQueueElement of the
>> > right size (and virtqueue_push/vring_push free it), as mentioned in the
>> > review of patch 8.
> Unluckily, neither the iovec nor the addr array can fit into 500 bytes, :-(
> Not to mention that all users of VirtQueueElement would need to be changed too;
> I'd hate to pull that work into this patchset, :-)

Well, the point of dataplane was not just to get maximum iops.  It was
also to provide guidance in the work necessary to improve the code and
get maximum iops without special-casing everything.  This can be a lot
of work indeed.

>> >
>> > However, I now remembered that VirtQueueElement is a mess because it's
>> > serialized directly into the migration state. :(  So you basically
>> > cannot change it without mucking with migration.  Please leave out patch
>> > 8 for now.
> save_device code serializes elem in this way:
> 
>  qemu_put_buffer(f, (unsigned char *)&req->elem,
> sizeof(VirtQueueElement));
> 
> so I am wondering why this patch may break migration.

Because you change the on-wire format and break migration from 2.1 to
2.2.  Sorry, I wasn't clear enough.

Paolo

> And in my test live migration can survive with the patch.
> 
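As an aside, the right-sized allocation suggested in the quoted text above could look roughly like this sketch (SizedVQElement and sized_vq_element_new are hypothetical names, uint64_t stands in for hwaddr, and this is not how virtqueue_pop actually worked at the time):

#include <glib.h>
#include <stdint.h>
#include <sys/uio.h>

typedef struct SizedVQElement {
    unsigned int index;
    unsigned int out_num;
    unsigned int in_num;
    uint64_t *in_addr;          /* arrays live right after the header */
    uint64_t *out_addr;
    struct iovec *in_sg;
    struct iovec *out_sg;
} SizedVQElement;

/* Allocate header + arrays in one block sized to the descriptor chain, so a
 * typical request stays well under the ~500-byte g_slice slab threshold. */
static SizedVQElement *sized_vq_element_new(unsigned out_num, unsigned in_num)
{
    size_t sz = sizeof(SizedVQElement)
              + (size_t)(in_num + out_num) * (sizeof(uint64_t) + sizeof(struct iovec));
    SizedVQElement *e = g_malloc0(sz);
    char *p = (char *)(e + 1);

    e->out_num = out_num;
    e->in_num = in_num;
    e->in_addr = (uint64_t *)p;
    p += in_num * sizeof(uint64_t);
    e->out_addr = (uint64_t *)p;
    p += out_num * sizeof(uint64_t);
    e->in_sg = (struct iovec *)p;
    p += in_num * sizeof(struct iovec);
    e->out_sg = (struct iovec *)p;
    return e;
}

/* The matching push/free path would then end with a plain g_free(e). */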




Re: [Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-07-30 Thread Ming Lei
On Wed, Jul 30, 2014 at 10:14 PM, Paolo Bonzini  wrote:
> Il 30/07/2014 13:39, Ming Lei ha scritto:
>> g_slice_new(VirtIOBlockReq), its free pair, and accesses to the instance
>> are a bit slow since sizeof(VirtIOBlockReq) is more than 40KB,
>> so use an object pool to speed up allocation and release.
>>
>> With this patch, a ~5% throughput improvement is observed in the VM
>> running on the server.
>>
>> Signed-off-by: Ming Lei 
>> ---
>>  hw/block/dataplane/virtio-blk.c |   12 
>>  hw/block/virtio-blk.c   |   13 +++--
>>  include/hw/virtio/virtio-blk.h  |2 ++
>>  3 files changed, 25 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/block/dataplane/virtio-blk.c 
>> b/hw/block/dataplane/virtio-blk.c
>> index 2093e4a..828fe99 100644
>> --- a/hw/block/dataplane/virtio-blk.c
>> +++ b/hw/block/dataplane/virtio-blk.c
>> @@ -24,6 +24,8 @@
>>  #include "hw/virtio/virtio-bus.h"
>>  #include "qom/object_interfaces.h"
>>
>> +#define REQ_POOL_SZ 128
>> +
>>  struct VirtIOBlockDataPlane {
>>  bool started;
>>  bool starting;
>> @@ -51,6 +53,10 @@ struct VirtIOBlockDataPlane {
>>  Error *blocker;
>>  void (*saved_complete_request)(struct VirtIOBlockReq *req,
>> unsigned char status);
>> +
>> +VirtIOBlockReq  reqs[REQ_POOL_SZ];
>> +void *free_reqs[REQ_POOL_SZ];
>> +ObjPool  req_pool;
>>  };
>>
>>  /* Raise an interrupt to signal guest, if necessary */
>> @@ -238,6 +244,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane 
>> *s)
>>  return;
>>  }
>>
>> +vblk->obj_pool = &s->req_pool;
>> +obj_pool_init(vblk->obj_pool, s->reqs, s->free_reqs,
>> +  sizeof(VirtIOBlockReq), REQ_POOL_SZ);
>> +
>>  /* Set up guest notifier (irq) */
>>  if (k->set_guest_notifiers(qbus->parent, 1, true) != 0) {
>>  fprintf(stderr, "virtio-blk failed to set guest notifier, "
>> @@ -298,6 +308,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>>
>>  aio_context_release(s->ctx);
>>
>> +vblk->obj_pool = NULL;
>> +
>>  if (s->raw_format) {
>>  qemu_aio_set_bypass_co(s->ctx, false);
>>  }
>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>> index c241c50..2a11bc4 100644
>> --- a/hw/block/virtio-blk.c
>> +++ b/hw/block/virtio-blk.c
>> @@ -31,7 +31,11 @@
>>
>>  VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
>>  {
>> -VirtIOBlockReq *req = g_slice_new(VirtIOBlockReq);
>> +VirtIOBlockReq *req = obj_pool_get(s->obj_pool);
>> +
>> +if (!req) {
>> +req = g_slice_new(VirtIOBlockReq);
>> +}
>>  req->dev = s;
>>  req->qiov.size = 0;
>>  req->next = NULL;
>> @@ -41,7 +45,11 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
>>  void virtio_blk_free_request(VirtIOBlockReq *req)
>>  {
>>  if (req) {
>> -g_slice_free(VirtIOBlockReq, req);
>> +if (obj_pool_has_obj(req->dev->obj_pool, req)) {
>> +obj_pool_put(req->dev->obj_pool, req);
>> +} else {
>> +g_slice_free(VirtIOBlockReq, req);
>> +}
>>  }
>>  }
>>
>> @@ -801,6 +809,7 @@ static void virtio_blk_instance_init(Object *obj)
>>  {
>>  VirtIOBlock *s = VIRTIO_BLK(obj);
>>
>> +s->obj_pool = NULL;
>>  object_property_add_link(obj, "iothread", TYPE_IOTHREAD,
>>   (Object **)&s->blk.iothread,
>>   qdev_prop_allow_set_link_before_realize,
>> diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
>> index afb7b8d..49ac234 100644
>> --- a/include/hw/virtio/virtio-blk.h
>> +++ b/include/hw/virtio/virtio-blk.h
>> @@ -18,6 +18,7 @@
>>  #include "hw/block/block.h"
>>  #include "sysemu/iothread.h"
>>  #include "block/block.h"
>> +#include "qemu/obj_pool.h"
>>
>>  #define TYPE_VIRTIO_BLK "virtio-blk-device"
>>  #define VIRTIO_BLK(obj) \
>> @@ -135,6 +136,7 @@ typedef struct VirtIOBlock {
>>  Notifier migration_state_notifier;
>>  struct VirtIOBlockDataPlane *dataplane;
>>  #endif
>> +ObjPool *obj_pool;
>>  } VirtIOBlock;
>>
>>  typedef struct MultiReqBuffer {
>>
>
> The problem is that g_slice here is not using the slab-style allocator
> because the object is larger than roughly 500 bytes.  One solution would
> be to make virtqueue_pop/vring_pop allocate a VirtQueueElement of the
> right size (and virtqueue_push/vring_push free it), as mentioned in the
> review of patch 8.

Unluckily, neither the iovec nor the addr array can fit into 500 bytes, :-(
Not to mention that all users of VirtQueueElement would need to be changed too;
I'd hate to pull that work into this patchset, :-)

>
> However, I now remembered that VirtQueueElement is a mess because it's
> serialized directly into the migration state. :(  So you basically
> cannot change it without mucking with migration.  Please leave out patch
> 8 for now.

save_device code serializes elem in this way:

 qemu_put_buffer(f, (unsigned char *)&req->elem,
sizeof(VirtQueueElement));

Re: [Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-07-30 Thread Michael S. Tsirkin
On Wed, Jul 30, 2014 at 04:14:09PM +0200, Paolo Bonzini wrote:
> Il 30/07/2014 13:39, Ming Lei ha scritto:
> > g_slice_new(VirtIOBlockReq), its free pair, and accesses to the instance
> > are a bit slow since sizeof(VirtIOBlockReq) is more than 40KB,
> > so use an object pool to speed up allocation and release.
> > 
> > With this patch, a ~5% throughput improvement is observed in the VM
> > running on the server.
> > 
> > Signed-off-by: Ming Lei 
> > ---
> >  hw/block/dataplane/virtio-blk.c |   12 
> >  hw/block/virtio-blk.c   |   13 +++--
> >  include/hw/virtio/virtio-blk.h  |2 ++
> >  3 files changed, 25 insertions(+), 2 deletions(-)
> > 
> > diff --git a/hw/block/dataplane/virtio-blk.c 
> > b/hw/block/dataplane/virtio-blk.c
> > index 2093e4a..828fe99 100644
> > --- a/hw/block/dataplane/virtio-blk.c
> > +++ b/hw/block/dataplane/virtio-blk.c
> > @@ -24,6 +24,8 @@
> >  #include "hw/virtio/virtio-bus.h"
> >  #include "qom/object_interfaces.h"
> >  
> > +#define REQ_POOL_SZ 128
> > +
> >  struct VirtIOBlockDataPlane {
> >  bool started;
> >  bool starting;
> > @@ -51,6 +53,10 @@ struct VirtIOBlockDataPlane {
> >  Error *blocker;
> >  void (*saved_complete_request)(struct VirtIOBlockReq *req,
> > unsigned char status);
> > +
> > +VirtIOBlockReq  reqs[REQ_POOL_SZ];
> > +void *free_reqs[REQ_POOL_SZ];
> > +ObjPool  req_pool;
> >  };
> >  
> >  /* Raise an interrupt to signal guest, if necessary */
> > @@ -238,6 +244,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane 
> > *s)
> >  return;
> >  }
> >  
> > +vblk->obj_pool = &s->req_pool;
> > +obj_pool_init(vblk->obj_pool, s->reqs, s->free_reqs,
> > +  sizeof(VirtIOBlockReq), REQ_POOL_SZ);
> > +
> >  /* Set up guest notifier (irq) */
> >  if (k->set_guest_notifiers(qbus->parent, 1, true) != 0) {
> >  fprintf(stderr, "virtio-blk failed to set guest notifier, "
> > @@ -298,6 +308,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
> >  
> >  aio_context_release(s->ctx);
> >  
> > +vblk->obj_pool = NULL;
> > +
> >  if (s->raw_format) {
> >  qemu_aio_set_bypass_co(s->ctx, false);
> >  }
> > diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> > index c241c50..2a11bc4 100644
> > --- a/hw/block/virtio-blk.c
> > +++ b/hw/block/virtio-blk.c
> > @@ -31,7 +31,11 @@
> >  
> >  VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
> >  {
> > -VirtIOBlockReq *req = g_slice_new(VirtIOBlockReq);
> > +VirtIOBlockReq *req = obj_pool_get(s->obj_pool);
> > +
> > +if (!req) {
> > +req = g_slice_new(VirtIOBlockReq);
> > +}
> >  req->dev = s;
> >  req->qiov.size = 0;
> >  req->next = NULL;
> > @@ -41,7 +45,11 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
> >  void virtio_blk_free_request(VirtIOBlockReq *req)
> >  {
> >  if (req) {
> > -g_slice_free(VirtIOBlockReq, req);
> > +if (obj_pool_has_obj(req->dev->obj_pool, req)) {
> > +obj_pool_put(req->dev->obj_pool, req);
> > +} else {
> > +g_slice_free(VirtIOBlockReq, req);
> > +}
> >  }
> >  }
> >  
> > @@ -801,6 +809,7 @@ static void virtio_blk_instance_init(Object *obj)
> >  {
> >  VirtIOBlock *s = VIRTIO_BLK(obj);
> >  
> > +s->obj_pool = NULL;
> >  object_property_add_link(obj, "iothread", TYPE_IOTHREAD,
> >   (Object **)&s->blk.iothread,
> >   qdev_prop_allow_set_link_before_realize,
> > diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
> > index afb7b8d..49ac234 100644
> > --- a/include/hw/virtio/virtio-blk.h
> > +++ b/include/hw/virtio/virtio-blk.h
> > @@ -18,6 +18,7 @@
> >  #include "hw/block/block.h"
> >  #include "sysemu/iothread.h"
> >  #include "block/block.h"
> > +#include "qemu/obj_pool.h"
> >  
> >  #define TYPE_VIRTIO_BLK "virtio-blk-device"
> >  #define VIRTIO_BLK(obj) \
> > @@ -135,6 +136,7 @@ typedef struct VirtIOBlock {
> >  Notifier migration_state_notifier;
> >  struct VirtIOBlockDataPlane *dataplane;
> >  #endif
> > +ObjPool *obj_pool;
> >  } VirtIOBlock;
> >  
> >  typedef struct MultiReqBuffer {
> > 
> 
> The problem is that g_slice here is not using the slab-style allocator
> because the object is larger than roughly 500 bytes.  One solution would
> be to make virtqueue_pop/vring_pop allocate a VirtQueueElement of the
> right size (and virtqueue_push/vring_push free it), as mentioned in the
> review of patch 8.
> 
> However, I now remembered that VirtQueueElement is a mess because it's
> serialized directly into the migration state. :(

BTW, this needs to be fixed anyway.

> So you basically
> cannot change it without mucking with migration.  Please leave out patch
> 8 for now.
> 
> Let's use this object pool, however, please simplify the API, it should
> be just:
> 
> void obj_pool_init(ObjPool *pool, unsigned obj_size, unsigned cnt);

Re: [Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-07-30 Thread Paolo Bonzini
Il 30/07/2014 13:39, Ming Lei ha scritto:
> g_slice_new(VirtIOBlockReq), its free pair, and accesses to the instance
> are a bit slow since sizeof(VirtIOBlockReq) is more than 40KB,
> so use an object pool to speed up allocation and release.
> 
> With this patch, a ~5% throughput improvement is observed in the VM
> running on the server.
> 
> Signed-off-by: Ming Lei 
> ---
>  hw/block/dataplane/virtio-blk.c |   12 
>  hw/block/virtio-blk.c   |   13 +++--
>  include/hw/virtio/virtio-blk.h  |2 ++
>  3 files changed, 25 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 2093e4a..828fe99 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -24,6 +24,8 @@
>  #include "hw/virtio/virtio-bus.h"
>  #include "qom/object_interfaces.h"
>  
> +#define REQ_POOL_SZ 128
> +
>  struct VirtIOBlockDataPlane {
>  bool started;
>  bool starting;
> @@ -51,6 +53,10 @@ struct VirtIOBlockDataPlane {
>  Error *blocker;
>  void (*saved_complete_request)(struct VirtIOBlockReq *req,
> unsigned char status);
> +
> +VirtIOBlockReq  reqs[REQ_POOL_SZ];
> +void *free_reqs[REQ_POOL_SZ];
> +ObjPool  req_pool;
>  };
>  
>  /* Raise an interrupt to signal guest, if necessary */
> @@ -238,6 +244,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
>  return;
>  }
>  
> +vblk->obj_pool = &s->req_pool;
> +obj_pool_init(vblk->obj_pool, s->reqs, s->free_reqs,
> +  sizeof(VirtIOBlockReq), REQ_POOL_SZ);
> +
>  /* Set up guest notifier (irq) */
>  if (k->set_guest_notifiers(qbus->parent, 1, true) != 0) {
>  fprintf(stderr, "virtio-blk failed to set guest notifier, "
> @@ -298,6 +308,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
>  
>  aio_context_release(s->ctx);
>  
> +vblk->obj_pool = NULL;
> +
>  if (s->raw_format) {
>  qemu_aio_set_bypass_co(s->ctx, false);
>  }
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index c241c50..2a11bc4 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -31,7 +31,11 @@
>  
>  VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
>  {
> -VirtIOBlockReq *req = g_slice_new(VirtIOBlockReq);
> +VirtIOBlockReq *req = obj_pool_get(s->obj_pool);
> +
> +if (!req) {
> +req = g_slice_new(VirtIOBlockReq);
> +}
>  req->dev = s;
>  req->qiov.size = 0;
>  req->next = NULL;
> @@ -41,7 +45,11 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
>  void virtio_blk_free_request(VirtIOBlockReq *req)
>  {
>  if (req) {
> -g_slice_free(VirtIOBlockReq, req);
> +if (obj_pool_has_obj(req->dev->obj_pool, req)) {
> +obj_pool_put(req->dev->obj_pool, req);
> +} else {
> +g_slice_free(VirtIOBlockReq, req);
> +}
>  }
>  }
>  
> @@ -801,6 +809,7 @@ static void virtio_blk_instance_init(Object *obj)
>  {
>  VirtIOBlock *s = VIRTIO_BLK(obj);
>  
> +s->obj_pool = NULL;
>  object_property_add_link(obj, "iothread", TYPE_IOTHREAD,
>   (Object **)&s->blk.iothread,
>   qdev_prop_allow_set_link_before_realize,
> diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
> index afb7b8d..49ac234 100644
> --- a/include/hw/virtio/virtio-blk.h
> +++ b/include/hw/virtio/virtio-blk.h
> @@ -18,6 +18,7 @@
>  #include "hw/block/block.h"
>  #include "sysemu/iothread.h"
>  #include "block/block.h"
> +#include "qemu/obj_pool.h"
>  
>  #define TYPE_VIRTIO_BLK "virtio-blk-device"
>  #define VIRTIO_BLK(obj) \
> @@ -135,6 +136,7 @@ typedef struct VirtIOBlock {
>  Notifier migration_state_notifier;
>  struct VirtIOBlockDataPlane *dataplane;
>  #endif
> +ObjPool *obj_pool;
>  } VirtIOBlock;
>  
>  typedef struct MultiReqBuffer {
> 

The problem is that g_slice here is not using the slab-style allocator
because the object is larger than roughly 500 bytes.  One solution would
be to make virtqueue_pop/vring_pop allocate a VirtQueueElement of the
right size (and virtqueue_push/vring_push free it), as mentioned in the
review of patch 8.

However, I now remembered that VirtQueueElement is a mess because it's
serialized directly into the migration state. :(  So you basically
cannot change it without mucking with migration.  Please leave out patch
8 for now.

Let's use this object pool, however, please simplify the API, it should
be just:

void obj_pool_init(ObjPool *pool, unsigned obj_size, unsigned cnt);
void *obj_pool_alloc(ObjPool *pool);
void obj_pool_free(ObjPool *pool, void *obj);
void obj_pool_destroy(ObjPool *pool);

All allocations of the objs buffer, and all logic like

+VirtIOBlockReq *req = obj_pool_get(s->obj_pool);
+
+if (!req) {
+req = g_slice_new(VirtIOBlockReq);
+}

+if (obj_pool_has_obj(req->dev->obj_pool, req)) {
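
For reference, a minimal sketch of that simplified API, assuming the pool owns its backing storage and transparently falls back to the heap when it runs out (hypothetical implementation, not QEMU code):

#include <glib.h>

typedef struct ObjPool {
    char *objs;          /* cnt * obj_size backing buffer */
    void **free_objs;    /* stack of free slots */
    unsigned obj_size;
    unsigned cnt;
    unsigned free_cnt;
} ObjPool;

void obj_pool_init(ObjPool *pool, unsigned obj_size, unsigned cnt)
{
    unsigned i;

    pool->objs = g_malloc0((size_t)obj_size * cnt);
    pool->free_objs = g_new(void *, cnt);
    pool->obj_size = obj_size;
    pool->cnt = cnt;
    pool->free_cnt = cnt;
    for (i = 0; i < cnt; i++) {
        pool->free_objs[i] = pool->objs + (size_t)i * obj_size;
    }
}

void *obj_pool_alloc(ObjPool *pool)
{
    if (pool->free_cnt == 0) {
        return g_malloc(pool->obj_size);    /* pool empty: fall back to the heap */
    }
    return pool->free_objs[--pool->free_cnt];
}

void obj_pool_free(ObjPool *pool, void *obj)
{
    char *p = obj;

    if (p >= pool->objs && p < pool->objs + (size_t)pool->cnt * pool->obj_size) {
        pool->free_objs[pool->free_cnt++] = obj;   /* pool-backed: recycle */
    } else {
        g_free(obj);                               /* heap fallback */
    }
}

void obj_pool_destroy(ObjPool *pool)
{
    g_free(pool->free_objs);
    g_free(pool->objs);
    pool->free_objs = NULL;
    pool->objs = NULL;
}

With the fallback hidden inside obj_pool_alloc()/obj_pool_free(), virtio_blk_alloc_request() and virtio_blk_free_request() reduce to a single call each, which is the simplification being asked for here.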

[Qemu-devel] [PATCH 07/15] dataplane: use object pool to speed up allocation for virtio blk request

2014-07-30 Thread Ming Lei
g_slice_new(VirtIOBlockReq), its free pair, and accesses to the instance
are a bit slow since sizeof(VirtIOBlockReq) is more than 40KB,
so use an object pool to speed up allocation and release.

With this patch, a ~5% throughput improvement is observed in the VM
running on the server.

Signed-off-by: Ming Lei 
---
 hw/block/dataplane/virtio-blk.c |   12 
 hw/block/virtio-blk.c   |   13 +++--
 include/hw/virtio/virtio-blk.h  |2 ++
 3 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2093e4a..828fe99 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -24,6 +24,8 @@
 #include "hw/virtio/virtio-bus.h"
 #include "qom/object_interfaces.h"
 
+#define REQ_POOL_SZ 128
+
 struct VirtIOBlockDataPlane {
 bool started;
 bool starting;
@@ -51,6 +53,10 @@ struct VirtIOBlockDataPlane {
 Error *blocker;
 void (*saved_complete_request)(struct VirtIOBlockReq *req,
unsigned char status);
+
+VirtIOBlockReq  reqs[REQ_POOL_SZ];
+void *free_reqs[REQ_POOL_SZ];
+ObjPool  req_pool;
 };
 
 /* Raise an interrupt to signal guest, if necessary */
@@ -238,6 +244,10 @@ void virtio_blk_data_plane_start(VirtIOBlockDataPlane *s)
 return;
 }
 
+vblk->obj_pool = &s->req_pool;
+obj_pool_init(vblk->obj_pool, s->reqs, s->free_reqs,
+  sizeof(VirtIOBlockReq), REQ_POOL_SZ);
+
 /* Set up guest notifier (irq) */
 if (k->set_guest_notifiers(qbus->parent, 1, true) != 0) {
 fprintf(stderr, "virtio-blk failed to set guest notifier, "
@@ -298,6 +308,8 @@ void virtio_blk_data_plane_stop(VirtIOBlockDataPlane *s)
 
 aio_context_release(s->ctx);
 
+vblk->obj_pool = NULL;
+
 if (s->raw_format) {
 qemu_aio_set_bypass_co(s->ctx, false);
 }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c241c50..2a11bc4 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -31,7 +31,11 @@
 
 VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
 {
-VirtIOBlockReq *req = g_slice_new(VirtIOBlockReq);
+VirtIOBlockReq *req = obj_pool_get(s->obj_pool);
+
+if (!req) {
+req = g_slice_new(VirtIOBlockReq);
+}
 req->dev = s;
 req->qiov.size = 0;
 req->next = NULL;
@@ -41,7 +45,11 @@ VirtIOBlockReq *virtio_blk_alloc_request(VirtIOBlock *s)
 void virtio_blk_free_request(VirtIOBlockReq *req)
 {
 if (req) {
-g_slice_free(VirtIOBlockReq, req);
+if (obj_pool_has_obj(req->dev->obj_pool, req)) {
+obj_pool_put(req->dev->obj_pool, req);
+} else {
+g_slice_free(VirtIOBlockReq, req);
+}
 }
 }
 
@@ -801,6 +809,7 @@ static void virtio_blk_instance_init(Object *obj)
 {
 VirtIOBlock *s = VIRTIO_BLK(obj);
 
+s->obj_pool = NULL;
 object_property_add_link(obj, "iothread", TYPE_IOTHREAD,
  (Object **)&s->blk.iothread,
  qdev_prop_allow_set_link_before_realize,
diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index afb7b8d..49ac234 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -18,6 +18,7 @@
 #include "hw/block/block.h"
 #include "sysemu/iothread.h"
 #include "block/block.h"
+#include "qemu/obj_pool.h"
 
 #define TYPE_VIRTIO_BLK "virtio-blk-device"
 #define VIRTIO_BLK(obj) \
@@ -135,6 +136,7 @@ typedef struct VirtIOBlock {
 Notifier migration_state_notifier;
 struct VirtIOBlockDataPlane *dataplane;
 #endif
+ObjPool *obj_pool;
 } VirtIOBlock;
 
 typedef struct MultiReqBuffer {
-- 
1.7.9.5
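
Note: the patch depends on qemu/obj_pool.h, which is not included in this message. A minimal sketch consistent with how the patch calls it (caller-provided storage, and a NULL pool tolerated so the non-dataplane path simply falls back to g_slice) might look like this; the real header may differ:

#include <stdbool.h>
#include <stddef.h>

typedef struct ObjPool {
    char *objs;          /* caller-provided array of cnt objects */
    void **free_objs;    /* caller-provided stack of free slots */
    unsigned obj_size;
    unsigned cnt;
    unsigned free_cnt;
} ObjPool;

static inline void obj_pool_init(ObjPool *pool, void *objs_buf, void **free_objs,
                                 unsigned obj_size, unsigned cnt)
{
    unsigned i;

    pool->objs = objs_buf;
    pool->free_objs = free_objs;
    pool->obj_size = obj_size;
    pool->cnt = cnt;
    pool->free_cnt = cnt;
    for (i = 0; i < cnt; i++) {
        free_objs[i] = pool->objs + (size_t)i * obj_size;
    }
}

static inline void *obj_pool_get(ObjPool *pool)
{
    if (!pool || pool->free_cnt == 0) {
        return NULL;                 /* caller falls back to g_slice_new() */
    }
    return pool->free_objs[--pool->free_cnt];
}

static inline bool obj_pool_has_obj(ObjPool *pool, void *obj)
{
    char *p = obj;

    return pool && p >= pool->objs &&
           p < pool->objs + (size_t)pool->cnt * pool->obj_size;
}

static inline void obj_pool_put(ObjPool *pool, void *obj)
{
    pool->free_objs[pool->free_cnt++] = obj;
}

Here obj_pool_get() returning NULL for both an empty pool and a NULL pool is what lets virtio_blk_alloc_request() above fall back to g_slice_new() unchanged.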