Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support

2012-07-19 Thread Anthony Liguori
Asias He  writes:

> vhost-blk is a in kernel virito-blk device accelerator.
>
> This patch is based on Liu Yuan's implementation with various
> improvements and bug fixes. Notably, this patch makes guest notify and
> host completion processing in parallel which gives about 60% performance
> improvement compared to Liu Yuan's implementation.
>
> Performance evaluation:
> -
> The comparison is between kvm tool with usersapce implementation and kvm
> tool with vhost-blk.
>
> 1) Fio with libaio ioengine on Fusion IO device
> With bio-based IO path, sequential read/write, random read/write
> IOPS boost : 8.4%, 15.3%, 10.4%, 14.6%
> Latency improvement: 8.5%, 15.4%, 10.4%, 15.1%
>
> 2) Fio with vsync ioengine on Fusion IO device
> With bio-based IO path, sequential read/write, random read/write
> IOPS boost : 10.5%, 4.8%, 5.2%, 5.6%
> Latency improvement: 11.4%, 5.0%, 5.2%, 5.8%
>
> Cc: Michael S. Tsirkin 
> Cc: linux-ker...@vger.kernel.org
> Cc: k...@vger.kernel.org
> Cc: virtualization@lists.linux-foundation.org
> Signed-off-by: Asias He 
> ---
>  drivers/vhost/Kconfig  |   10 +
>  drivers/vhost/Makefile |2 +
>  drivers/vhost/blk.c|  600 
> 
>  drivers/vhost/vhost.h  |5 +
>  include/linux/vhost.h  |3 +
>  5 files changed, 620 insertions(+)
>  create mode 100644 drivers/vhost/blk.c
>
> diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> index c387067..fa071a8 100644
> --- a/drivers/vhost/Kconfig
> +++ b/drivers/vhost/Kconfig
> @@ -16,4 +16,14 @@ config VHOST_NET
>  
> To compile this driver as a module, choose M here: the module will
> be called vhost_net.
> +config VHOST_BLK
> + tristate "Host kernel accelerator for virtio blk (EXPERIMENTAL)"
> + depends on VHOST && BLOCK && AIO && EVENTFD && EXPERIMENTAL
> + ---help---
> +   This kernel module can be loaded in host kernel to accelerate
> +   guest block with virtio_blk. Not to be confused with virtio_blk
> +   module itself which needs to be loaded in guest kernel.
> +
> +   To compile this driver as a module, choose M here: the module will
> +   be called vhost_blk.
>  
> diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
> index cd36885..aa461d5 100644
> --- a/drivers/vhost/Makefile
> +++ b/drivers/vhost/Makefile
> @@ -1,4 +1,6 @@
>  obj-$(CONFIG_VHOST)  += vhost.o
>  obj-$(CONFIG_VHOST_NET) += vhost_net.o
> +obj-$(CONFIG_VHOST_BLK) += vhost_blk.o
>  
>  vhost_net-y  := net.o
> +vhost_blk-y  := blk.o
> diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c
> new file mode 100644
> index 000..6a94894
> --- /dev/null
> +++ b/drivers/vhost/blk.c
> @@ -0,0 +1,600 @@
> +/*
> + * Copyright (C) 2011 Taobao, Inc.
> + * Author: Liu Yuan 
> + *
> + * Copyright (C) 2012 Red Hat, Inc.
> + * Author: Asias He 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.
> + *
> + * virtio-blk server in host kernel.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "vhost.h"
> +
> +#define BLK_HDR  0
> +
> +enum {
> + VHOST_BLK_VQ_REQ = 0,
> + VHOST_BLK_VQ_MAX = 1,
> +};
> +
> +struct vhost_blk_req {
> + u16 head;
> + u8 *status;
> +};
> +
> +struct vhost_blk {
> + struct task_struct *worker_host_kick;
> + struct task_struct *worker;
> + struct vhost_blk_req *reqs;
> + struct vhost_virtqueue vq;
> + struct eventfd_ctx *ectx;
> + struct io_event *ioevent;
> + struct kioctx *ioctx;
> + struct vhost_dev dev;
> + struct file *efile;
> + u64 ioevent_nr;
> + bool stop;
> +};
> +
> +static inline int vhost_blk_read_events(struct vhost_blk *blk, long nr)
> +{
> + mm_segment_t old_fs = get_fs();
> + int ret;
> +
> + set_fs(KERNEL_DS);
> + ret = read_events(blk->ioctx, nr, nr, blk->ioevent, NULL);
> + set_fs(old_fs);
> +
> + return ret;
> +}
> +
> +static int vhost_blk_setup(struct vhost_blk *blk)
> +{
> + struct kioctx *ctx;
> +
> + if (blk->ioctx)
> + return 0;
> +
> + blk->ioevent_nr = blk->vq.num;
> + ctx = ioctx_alloc(blk->ioevent_nr);
> + if (IS_ERR(ctx)) {
> + pr_err("Failed to ioctx_alloc");
> + return PTR_ERR(ctx);
> + }

Not that it's very likely that ioctx_alloc will fail in practice.
There's a fixed number of events that can be allocated that's currently
0x1.  If you have a ring queue size of 1024 (which is normal) then
that limits you to 64 vhost-blk devices.

Realistically, I don't think you can only do aio with vhost-blk because
of this (and many other) limitations.  It's necessary to be able to fall
back to a thread pool because AIO cannot be relied upon.

> + put_ioctx(ctx);
> + blk->ioctx = ctx;
> +
> + blk->ioevent = kmalloc(sizeof(struct io_event) * blk->ioevent_nr,
> +  

Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support

2012-07-19 Thread Michael S. Tsirkin
On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote:
> Of course, the million dollar question is why would using AIO in the
> kernel be faster than using AIO in userspace?

Actually for me a more important question is how does it compare
with virtio-blk dataplane?

-- 
MST
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support

2012-07-19 Thread Michael S. Tsirkin
On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote:
> Asias He  writes:
> 
> > vhost-blk is a in kernel virito-blk device accelerator.
> >
> > This patch is based on Liu Yuan's implementation with various
> > improvements and bug fixes. Notably, this patch makes guest notify and
> > host completion processing in parallel which gives about 60% performance
> > improvement compared to Liu Yuan's implementation.
> >
> > Performance evaluation:
> > -
> > The comparison is between kvm tool with usersapce implementation and kvm
> > tool with vhost-blk.
> >
> > 1) Fio with libaio ioengine on Fusion IO device
> > With bio-based IO path, sequential read/write, random read/write
> > IOPS boost : 8.4%, 15.3%, 10.4%, 14.6%
> > Latency improvement: 8.5%, 15.4%, 10.4%, 15.1%
> >
> > 2) Fio with vsync ioengine on Fusion IO device
> > With bio-based IO path, sequential read/write, random read/write
> > IOPS boost : 10.5%, 4.8%, 5.2%, 5.6%
> > Latency improvement: 11.4%, 5.0%, 5.2%, 5.8%
> >
> > Cc: Michael S. Tsirkin 
> > Cc: linux-ker...@vger.kernel.org
> > Cc: k...@vger.kernel.org
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Asias He 
> > ---
> >  drivers/vhost/Kconfig  |   10 +
> >  drivers/vhost/Makefile |2 +
> >  drivers/vhost/blk.c|  600 
> > 
> >  drivers/vhost/vhost.h  |5 +
> >  include/linux/vhost.h  |3 +
> >  5 files changed, 620 insertions(+)
> >  create mode 100644 drivers/vhost/blk.c
> >
> > diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
> > index c387067..fa071a8 100644
> > --- a/drivers/vhost/Kconfig
> > +++ b/drivers/vhost/Kconfig
> > @@ -16,4 +16,14 @@ config VHOST_NET
> >  
> >   To compile this driver as a module, choose M here: the module will
> >   be called vhost_net.
> > +config VHOST_BLK
> > +   tristate "Host kernel accelerator for virtio blk (EXPERIMENTAL)"
> > +   depends on VHOST && BLOCK && AIO && EVENTFD && EXPERIMENTAL
> > +   ---help---
> > + This kernel module can be loaded in host kernel to accelerate
> > + guest block with virtio_blk. Not to be confused with virtio_blk
> > + module itself which needs to be loaded in guest kernel.
> > +
> > + To compile this driver as a module, choose M here: the module will
> > + be called vhost_blk.
> >  
> > diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile
> > index cd36885..aa461d5 100644
> > --- a/drivers/vhost/Makefile
> > +++ b/drivers/vhost/Makefile
> > @@ -1,4 +1,6 @@
> >  obj-$(CONFIG_VHOST)+= vhost.o
> >  obj-$(CONFIG_VHOST_NET) += vhost_net.o
> > +obj-$(CONFIG_VHOST_BLK) += vhost_blk.o
> >  
> >  vhost_net-y:= net.o
> > +vhost_blk-y:= blk.o
> > diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c
> > new file mode 100644
> > index 000..6a94894
> > --- /dev/null
> > +++ b/drivers/vhost/blk.c
> > @@ -0,0 +1,600 @@
> > +/*
> > + * Copyright (C) 2011 Taobao, Inc.
> > + * Author: Liu Yuan 
> > + *
> > + * Copyright (C) 2012 Red Hat, Inc.
> > + * Author: Asias He 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2.
> > + *
> > + * virtio-blk server in host kernel.
> > + */
> > +
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "vhost.h"
> > +
> > +#define BLK_HDR0
> > +
> > +enum {
> > +   VHOST_BLK_VQ_REQ = 0,
> > +   VHOST_BLK_VQ_MAX = 1,
> > +};
> > +
> > +struct vhost_blk_req {
> > +   u16 head;
> > +   u8 *status;
> > +};
> > +
> > +struct vhost_blk {
> > +   struct task_struct *worker_host_kick;
> > +   struct task_struct *worker;
> > +   struct vhost_blk_req *reqs;
> > +   struct vhost_virtqueue vq;
> > +   struct eventfd_ctx *ectx;
> > +   struct io_event *ioevent;
> > +   struct kioctx *ioctx;
> > +   struct vhost_dev dev;
> > +   struct file *efile;
> > +   u64 ioevent_nr;
> > +   bool stop;
> > +};
> > +
> > +static inline int vhost_blk_read_events(struct vhost_blk *blk, long nr)
> > +{
> > +   mm_segment_t old_fs = get_fs();
> > +   int ret;
> > +
> > +   set_fs(KERNEL_DS);
> > +   ret = read_events(blk->ioctx, nr, nr, blk->ioevent, NULL);
> > +   set_fs(old_fs);
> > +
> > +   return ret;
> > +}
> > +
> > +static int vhost_blk_setup(struct vhost_blk *blk)
> > +{
> > +   struct kioctx *ctx;
> > +
> > +   if (blk->ioctx)
> > +   return 0;
> > +
> > +   blk->ioevent_nr = blk->vq.num;
> > +   ctx = ioctx_alloc(blk->ioevent_nr);
> > +   if (IS_ERR(ctx)) {
> > +   pr_err("Failed to ioctx_alloc");
> > +   return PTR_ERR(ctx);
> > +   }
> 
> Not that it's very likely that ioctx_alloc will fail in practice.
> There's a fixed number of events that can be allocated that's currently
> 0x1.  If you have a ring queue size of 1024 (which is normal) then
> that limits you to 64 vhost-blk devices.
> 
> Realistically, I don'

Re: [PATCH v4 1/3] mm: introduce compaction and migration for virtio ballooned pages

2012-07-19 Thread Rafael Aquini
On Wed, Jul 18, 2012 at 06:29:44PM -0700, Andrew Morton wrote:
> On Wed, 18 Jul 2012 22:00:48 -0300 Rafael Aquini  wrote:
> 
> > > So the function needs a better name - one which communicates that it is
> > > a balloon page *for the purposes of processing by the compaction code*. 
> > > Making the function private to compaction.c would help with that, if
> > > feasible.
> > > 
> > 
> > How about this (adjusted) approach:
> 
> it fails checkpatch ;)
>
Ugh! it fails due to a lacking whitespace... will fix that right away.
 
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1629,8 +1629,7 @@ static inline unsigned int 
> > debug_guardpage_minorder(void)
> > { return 0; }
> >  static inline bool page_is_guard(struct page *page) { return false; }
> >  #endif /* CONFIG_DEBUG_PAGEALLOC */
> >  
> > -#if (defined(CONFIG_VIRTIO_BALLOON) || \
> > -   defined(CONFIG_VIRTIO_BALLOON_MODULE)) && defined(CONFIG_COMPACTION)
> > +#if (defined(CONFIG_VIRTIO_BALLOON) 
> > ||defined(CONFIG_VIRTIO_BALLOON_MODULE))
> >  extern bool putback_balloon_page(struct page *);
> >  extern struct address_space *balloon_mapping;
> >  
> > @@ -1638,11 +1637,13 @@ static inline bool is_balloon_page(struct page 
> > *page)
> >  {
> > return (page->mapping && page->mapping == balloon_mapping);
> >  }
> > +#if defined(CONFIG_COMPACTION)
> > +static inline bool balloon_compaction_enabled(void) { return true; }
> >  #else
> > -static inline bool is_balloon_page(struct page *page)   { return 
> > false; }
> > -static inline bool isolate_balloon_page(struct page *page)  { return 
> > false; }
> > -static inline bool putback_balloon_page(struct page *page)  { return 
> > false; }
> > -#endif /* (VIRTIO_BALLOON || VIRTIO_BALLOON_MODULE) && COMPACTION */
> > +static inline bool putback_balloon_page(struct page *page) { return false; 
> > }
> > +static inline bool balloon_compaction_enabled(void) { return false; }
> > +#endif /* CONFIG_COMPACTION */
> > +#endif /* (CONFIG_VIRTIO_BALLOON || CONFIG_VIRTIO_BALLOON_MODULE) */
> >  
> >  #endif /* __KERNEL__ */
> >  #endif /* _LINUX_MM_H */
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 59c7bc5..f5f6a7d 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -78,7 +78,8 @@ void putback_lru_pages(struct list_head *l)
> > list_del(&page->lru);
> > dec_zone_page_state(page, NR_ISOLATED_ANON +
> > page_is_file_cache(page));
> > -   if (unlikely(is_balloon_page(page)))
> > +   if (unlikely(is_balloon_page(page)) &&
> > +   balloon_compaction_enabled())
> 
> well, that helps readability.  But what does is_balloon_page() return
> when invoked on a balloon page when CONFIG_COMPACTION=n?  False,
> methinks.
It will (now) return the right thing accordingly to the page->mapping tests.

> 
> I think the code as you previously had it was OK, but the
> is_balloon_page() name is misleading.  It really wants to be called
> is_potentially_compactible_balloon_page() :( Maybe rename it to
> compactible_balloon_page()?

With all due respect, sir, I don't believe renaming it is the right thing to do.
My major supporting reason is since Lumpy Reclaim is already evicted it looks
natural CONFIG_COMPACTION=y becoming a permanent feature, thus making that
preprocessor test useless and the renamed function signature nonsense, IMHO.
That's why I keep respectfully figthing against your argument.

Here goes another suggestion, to keep is_balloon_page() name as is. This way I
believe all concerns are potentially addressed, as there's no implicit and
misleading relationship between is_balloon_page and CONFIG_COMPACTION=y anymore,
as well as there are no potential build breakages due to (unexpected) config 
options.


diff --git a/include/linux/mm.h b/include/linux/mm.h
index b36d08c..e29ad44 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1629,5 +1629,30 @@ static inline unsigned int debug_guardpage_minorder(void)
{ return 0; }
 static inline bool page_is_guard(struct page *page) { return false; }
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
+#if (defined(CONFIG_VIRTIO_BALLOON) || defined(CONFIG_VIRTIO_BALLOON_MODULE))
+extern bool putback_balloon_page(struct page *);
+extern struct address_space *balloon_mapping;
+
+static inline bool is_balloon_page(struct page *page)
+{
+   return (page->mapping && page->mapping == balloon_mapping);
+}
+
+static inline bool balloon_compaction_enabled(void)
+{
+#if defined(CONFIG_COMPACTION)
+   return true;
+#else
+   return false;
+#endif /* CONFIG_COMPACTION */
+}
+
+#else
+static inline bool isolate_balloon_page(struct page *page) { return false; }
+static inline bool putback_balloon_page(struct page *page) { return false; }
+static inline bool is_balloon_page(struct page *page)  { return false; }
+static inline bool balloon_compaction_enabled(void){ return false; }
+#endif /* (CONFIG_VIRTIO_BALLOON || CONFIG_VIRTIO_BALLOON_MODULE) */
+
 #endif /* __KERN

Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6

2012-07-19 Thread Paolo Bonzini
Il 19/07/2012 09:28, James Bottomley ha scritto:
>> > INQUIRY responses (at least vendor/product/type) should not change.
> INQUIRY responses often change for arrays because a firmware upgrade
> enables new features and new features have to declare themselves,
> usually in the INQUIRY data.  What you mean, I think, is that previously
> exposed features in INQUIRY data, as well as strings
> (vendor/product/type, as you say), shouldn't change, but unexposed data
> (read 0 in the fields) may.

What I meant is that it's unlikely that Windows fingerprinting is using
anything but vendor/product/type, because everything else can change.

Paolo

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization