Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
Asias He writes: > vhost-blk is a in kernel virito-blk device accelerator. > > This patch is based on Liu Yuan's implementation with various > improvements and bug fixes. Notably, this patch makes guest notify and > host completion processing in parallel which gives about 60% performance > improvement compared to Liu Yuan's implementation. > > Performance evaluation: > - > The comparison is between kvm tool with usersapce implementation and kvm > tool with vhost-blk. > > 1) Fio with libaio ioengine on Fusion IO device > With bio-based IO path, sequential read/write, random read/write > IOPS boost : 8.4%, 15.3%, 10.4%, 14.6% > Latency improvement: 8.5%, 15.4%, 10.4%, 15.1% > > 2) Fio with vsync ioengine on Fusion IO device > With bio-based IO path, sequential read/write, random read/write > IOPS boost : 10.5%, 4.8%, 5.2%, 5.6% > Latency improvement: 11.4%, 5.0%, 5.2%, 5.8% > > Cc: Michael S. Tsirkin > Cc: linux-ker...@vger.kernel.org > Cc: k...@vger.kernel.org > Cc: virtualization@lists.linux-foundation.org > Signed-off-by: Asias He > --- > drivers/vhost/Kconfig | 10 + > drivers/vhost/Makefile |2 + > drivers/vhost/blk.c| 600 > > drivers/vhost/vhost.h |5 + > include/linux/vhost.h |3 + > 5 files changed, 620 insertions(+) > create mode 100644 drivers/vhost/blk.c > > diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig > index c387067..fa071a8 100644 > --- a/drivers/vhost/Kconfig > +++ b/drivers/vhost/Kconfig > @@ -16,4 +16,14 @@ config VHOST_NET > > To compile this driver as a module, choose M here: the module will > be called vhost_net. > +config VHOST_BLK > + tristate "Host kernel accelerator for virtio blk (EXPERIMENTAL)" > + depends on VHOST && BLOCK && AIO && EVENTFD && EXPERIMENTAL > + ---help--- > + This kernel module can be loaded in host kernel to accelerate > + guest block with virtio_blk. Not to be confused with virtio_blk > + module itself which needs to be loaded in guest kernel. > + > + To compile this driver as a module, choose M here: the module will > + be called vhost_blk. > > diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile > index cd36885..aa461d5 100644 > --- a/drivers/vhost/Makefile > +++ b/drivers/vhost/Makefile > @@ -1,4 +1,6 @@ > obj-$(CONFIG_VHOST) += vhost.o > obj-$(CONFIG_VHOST_NET) += vhost_net.o > +obj-$(CONFIG_VHOST_BLK) += vhost_blk.o > > vhost_net-y := net.o > +vhost_blk-y := blk.o > diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c > new file mode 100644 > index 000..6a94894 > --- /dev/null > +++ b/drivers/vhost/blk.c > @@ -0,0 +1,600 @@ > +/* > + * Copyright (C) 2011 Taobao, Inc. > + * Author: Liu Yuan > + * > + * Copyright (C) 2012 Red Hat, Inc. > + * Author: Asias He > + * > + * This work is licensed under the terms of the GNU GPL, version 2. > + * > + * virtio-blk server in host kernel. > + */ > + > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > +#include > + > +#include "vhost.h" > + > +#define BLK_HDR 0 > + > +enum { > + VHOST_BLK_VQ_REQ = 0, > + VHOST_BLK_VQ_MAX = 1, > +}; > + > +struct vhost_blk_req { > + u16 head; > + u8 *status; > +}; > + > +struct vhost_blk { > + struct task_struct *worker_host_kick; > + struct task_struct *worker; > + struct vhost_blk_req *reqs; > + struct vhost_virtqueue vq; > + struct eventfd_ctx *ectx; > + struct io_event *ioevent; > + struct kioctx *ioctx; > + struct vhost_dev dev; > + struct file *efile; > + u64 ioevent_nr; > + bool stop; > +}; > + > +static inline int vhost_blk_read_events(struct vhost_blk *blk, long nr) > +{ > + mm_segment_t old_fs = get_fs(); > + int ret; > + > + set_fs(KERNEL_DS); > + ret = read_events(blk->ioctx, nr, nr, blk->ioevent, NULL); > + set_fs(old_fs); > + > + return ret; > +} > + > +static int vhost_blk_setup(struct vhost_blk *blk) > +{ > + struct kioctx *ctx; > + > + if (blk->ioctx) > + return 0; > + > + blk->ioevent_nr = blk->vq.num; > + ctx = ioctx_alloc(blk->ioevent_nr); > + if (IS_ERR(ctx)) { > + pr_err("Failed to ioctx_alloc"); > + return PTR_ERR(ctx); > + } Not that it's very likely that ioctx_alloc will fail in practice. There's a fixed number of events that can be allocated that's currently 0x1. If you have a ring queue size of 1024 (which is normal) then that limits you to 64 vhost-blk devices. Realistically, I don't think you can only do aio with vhost-blk because of this (and many other) limitations. It's necessary to be able to fall back to a thread pool because AIO cannot be relied upon. > + put_ioctx(ctx); > + blk->ioctx = ctx; > + > + blk->ioevent = kmalloc(sizeof(struct io_event) * blk->ioevent_nr, > +
Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote: > Of course, the million dollar question is why would using AIO in the > kernel be faster than using AIO in userspace? Actually for me a more important question is how does it compare with virtio-blk dataplane? -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH RESEND 5/5] vhost-blk: Add vhost-blk support
On Thu, Jul 19, 2012 at 08:05:42AM -0500, Anthony Liguori wrote: > Asias He writes: > > > vhost-blk is a in kernel virito-blk device accelerator. > > > > This patch is based on Liu Yuan's implementation with various > > improvements and bug fixes. Notably, this patch makes guest notify and > > host completion processing in parallel which gives about 60% performance > > improvement compared to Liu Yuan's implementation. > > > > Performance evaluation: > > - > > The comparison is between kvm tool with usersapce implementation and kvm > > tool with vhost-blk. > > > > 1) Fio with libaio ioengine on Fusion IO device > > With bio-based IO path, sequential read/write, random read/write > > IOPS boost : 8.4%, 15.3%, 10.4%, 14.6% > > Latency improvement: 8.5%, 15.4%, 10.4%, 15.1% > > > > 2) Fio with vsync ioengine on Fusion IO device > > With bio-based IO path, sequential read/write, random read/write > > IOPS boost : 10.5%, 4.8%, 5.2%, 5.6% > > Latency improvement: 11.4%, 5.0%, 5.2%, 5.8% > > > > Cc: Michael S. Tsirkin > > Cc: linux-ker...@vger.kernel.org > > Cc: k...@vger.kernel.org > > Cc: virtualization@lists.linux-foundation.org > > Signed-off-by: Asias He > > --- > > drivers/vhost/Kconfig | 10 + > > drivers/vhost/Makefile |2 + > > drivers/vhost/blk.c| 600 > > > > drivers/vhost/vhost.h |5 + > > include/linux/vhost.h |3 + > > 5 files changed, 620 insertions(+) > > create mode 100644 drivers/vhost/blk.c > > > > diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig > > index c387067..fa071a8 100644 > > --- a/drivers/vhost/Kconfig > > +++ b/drivers/vhost/Kconfig > > @@ -16,4 +16,14 @@ config VHOST_NET > > > > To compile this driver as a module, choose M here: the module will > > be called vhost_net. > > +config VHOST_BLK > > + tristate "Host kernel accelerator for virtio blk (EXPERIMENTAL)" > > + depends on VHOST && BLOCK && AIO && EVENTFD && EXPERIMENTAL > > + ---help--- > > + This kernel module can be loaded in host kernel to accelerate > > + guest block with virtio_blk. Not to be confused with virtio_blk > > + module itself which needs to be loaded in guest kernel. > > + > > + To compile this driver as a module, choose M here: the module will > > + be called vhost_blk. > > > > diff --git a/drivers/vhost/Makefile b/drivers/vhost/Makefile > > index cd36885..aa461d5 100644 > > --- a/drivers/vhost/Makefile > > +++ b/drivers/vhost/Makefile > > @@ -1,4 +1,6 @@ > > obj-$(CONFIG_VHOST)+= vhost.o > > obj-$(CONFIG_VHOST_NET) += vhost_net.o > > +obj-$(CONFIG_VHOST_BLK) += vhost_blk.o > > > > vhost_net-y:= net.o > > +vhost_blk-y:= blk.o > > diff --git a/drivers/vhost/blk.c b/drivers/vhost/blk.c > > new file mode 100644 > > index 000..6a94894 > > --- /dev/null > > +++ b/drivers/vhost/blk.c > > @@ -0,0 +1,600 @@ > > +/* > > + * Copyright (C) 2011 Taobao, Inc. > > + * Author: Liu Yuan > > + * > > + * Copyright (C) 2012 Red Hat, Inc. > > + * Author: Asias He > > + * > > + * This work is licensed under the terms of the GNU GPL, version 2. > > + * > > + * virtio-blk server in host kernel. > > + */ > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +#include "vhost.h" > > + > > +#define BLK_HDR0 > > + > > +enum { > > + VHOST_BLK_VQ_REQ = 0, > > + VHOST_BLK_VQ_MAX = 1, > > +}; > > + > > +struct vhost_blk_req { > > + u16 head; > > + u8 *status; > > +}; > > + > > +struct vhost_blk { > > + struct task_struct *worker_host_kick; > > + struct task_struct *worker; > > + struct vhost_blk_req *reqs; > > + struct vhost_virtqueue vq; > > + struct eventfd_ctx *ectx; > > + struct io_event *ioevent; > > + struct kioctx *ioctx; > > + struct vhost_dev dev; > > + struct file *efile; > > + u64 ioevent_nr; > > + bool stop; > > +}; > > + > > +static inline int vhost_blk_read_events(struct vhost_blk *blk, long nr) > > +{ > > + mm_segment_t old_fs = get_fs(); > > + int ret; > > + > > + set_fs(KERNEL_DS); > > + ret = read_events(blk->ioctx, nr, nr, blk->ioevent, NULL); > > + set_fs(old_fs); > > + > > + return ret; > > +} > > + > > +static int vhost_blk_setup(struct vhost_blk *blk) > > +{ > > + struct kioctx *ctx; > > + > > + if (blk->ioctx) > > + return 0; > > + > > + blk->ioevent_nr = blk->vq.num; > > + ctx = ioctx_alloc(blk->ioevent_nr); > > + if (IS_ERR(ctx)) { > > + pr_err("Failed to ioctx_alloc"); > > + return PTR_ERR(ctx); > > + } > > Not that it's very likely that ioctx_alloc will fail in practice. > There's a fixed number of events that can be allocated that's currently > 0x1. If you have a ring queue size of 1024 (which is normal) then > that limits you to 64 vhost-blk devices. > > Realistically, I don'
Re: [PATCH v4 1/3] mm: introduce compaction and migration for virtio ballooned pages
On Wed, Jul 18, 2012 at 06:29:44PM -0700, Andrew Morton wrote: > On Wed, 18 Jul 2012 22:00:48 -0300 Rafael Aquini wrote: > > > > So the function needs a better name - one which communicates that it is > > > a balloon page *for the purposes of processing by the compaction code*. > > > Making the function private to compaction.c would help with that, if > > > feasible. > > > > > > > How about this (adjusted) approach: > > it fails checkpatch ;) > Ugh! it fails due to a lacking whitespace... will fix that right away. > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -1629,8 +1629,7 @@ static inline unsigned int > > debug_guardpage_minorder(void) > > { return 0; } > > static inline bool page_is_guard(struct page *page) { return false; } > > #endif /* CONFIG_DEBUG_PAGEALLOC */ > > > > -#if (defined(CONFIG_VIRTIO_BALLOON) || \ > > - defined(CONFIG_VIRTIO_BALLOON_MODULE)) && defined(CONFIG_COMPACTION) > > +#if (defined(CONFIG_VIRTIO_BALLOON) > > ||defined(CONFIG_VIRTIO_BALLOON_MODULE)) > > extern bool putback_balloon_page(struct page *); > > extern struct address_space *balloon_mapping; > > > > @@ -1638,11 +1637,13 @@ static inline bool is_balloon_page(struct page > > *page) > > { > > return (page->mapping && page->mapping == balloon_mapping); > > } > > +#if defined(CONFIG_COMPACTION) > > +static inline bool balloon_compaction_enabled(void) { return true; } > > #else > > -static inline bool is_balloon_page(struct page *page) { return > > false; } > > -static inline bool isolate_balloon_page(struct page *page) { return > > false; } > > -static inline bool putback_balloon_page(struct page *page) { return > > false; } > > -#endif /* (VIRTIO_BALLOON || VIRTIO_BALLOON_MODULE) && COMPACTION */ > > +static inline bool putback_balloon_page(struct page *page) { return false; > > } > > +static inline bool balloon_compaction_enabled(void) { return false; } > > +#endif /* CONFIG_COMPACTION */ > > +#endif /* (CONFIG_VIRTIO_BALLOON || CONFIG_VIRTIO_BALLOON_MODULE) */ > > > > #endif /* __KERNEL__ */ > > #endif /* _LINUX_MM_H */ > > diff --git a/mm/migrate.c b/mm/migrate.c > > index 59c7bc5..f5f6a7d 100644 > > --- a/mm/migrate.c > > +++ b/mm/migrate.c > > @@ -78,7 +78,8 @@ void putback_lru_pages(struct list_head *l) > > list_del(&page->lru); > > dec_zone_page_state(page, NR_ISOLATED_ANON + > > page_is_file_cache(page)); > > - if (unlikely(is_balloon_page(page))) > > + if (unlikely(is_balloon_page(page)) && > > + balloon_compaction_enabled()) > > well, that helps readability. But what does is_balloon_page() return > when invoked on a balloon page when CONFIG_COMPACTION=n? False, > methinks. It will (now) return the right thing accordingly to the page->mapping tests. > > I think the code as you previously had it was OK, but the > is_balloon_page() name is misleading. It really wants to be called > is_potentially_compactible_balloon_page() :( Maybe rename it to > compactible_balloon_page()? With all due respect, sir, I don't believe renaming it is the right thing to do. My major supporting reason is since Lumpy Reclaim is already evicted it looks natural CONFIG_COMPACTION=y becoming a permanent feature, thus making that preprocessor test useless and the renamed function signature nonsense, IMHO. That's why I keep respectfully figthing against your argument. Here goes another suggestion, to keep is_balloon_page() name as is. This way I believe all concerns are potentially addressed, as there's no implicit and misleading relationship between is_balloon_page and CONFIG_COMPACTION=y anymore, as well as there are no potential build breakages due to (unexpected) config options. diff --git a/include/linux/mm.h b/include/linux/mm.h index b36d08c..e29ad44 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1629,5 +1629,30 @@ static inline unsigned int debug_guardpage_minorder(void) { return 0; } static inline bool page_is_guard(struct page *page) { return false; } #endif /* CONFIG_DEBUG_PAGEALLOC */ +#if (defined(CONFIG_VIRTIO_BALLOON) || defined(CONFIG_VIRTIO_BALLOON_MODULE)) +extern bool putback_balloon_page(struct page *); +extern struct address_space *balloon_mapping; + +static inline bool is_balloon_page(struct page *page) +{ + return (page->mapping && page->mapping == balloon_mapping); +} + +static inline bool balloon_compaction_enabled(void) +{ +#if defined(CONFIG_COMPACTION) + return true; +#else + return false; +#endif /* CONFIG_COMPACTION */ +} + +#else +static inline bool isolate_balloon_page(struct page *page) { return false; } +static inline bool putback_balloon_page(struct page *page) { return false; } +static inline bool is_balloon_page(struct page *page) { return false; } +static inline bool balloon_compaction_enabled(void){ return false; } +#endif /* (CONFIG_VIRTIO_BALLOON || CONFIG_VIRTIO_BALLOON_MODULE) */ + #endif /* __KERN
Re: [RFC-v2 0/4] tcm_vhost+cmwq fabric driver code for-3.6
Il 19/07/2012 09:28, James Bottomley ha scritto: >> > INQUIRY responses (at least vendor/product/type) should not change. > INQUIRY responses often change for arrays because a firmware upgrade > enables new features and new features have to declare themselves, > usually in the INQUIRY data. What you mean, I think, is that previously > exposed features in INQUIRY data, as well as strings > (vendor/product/type, as you say), shouldn't change, but unexposed data > (read 0 in the fields) may. What I meant is that it's unlikely that Windows fingerprinting is using anything but vendor/product/type, because everything else can change. Paolo ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization