Re: [kvm-devel] [PATCH 3/3] Implement linux-aio backend

2008-04-18 Thread Marcelo Tosatti
On Thu, Apr 17, 2008 at 02:26:52PM -0500, Anthony Liguori wrote:
> This patch introduces a Linux-aio backend that is disabled by default.  To
> use this backend effectively, the user should disable caching and select
> it with the appropriate -aio option.  For instance:
>
> qemu-system-x86_64 -drive foo.img,cache=off -aio linux
>
> There's no universal way to asynchronously wait with linux-aio.  At some point,
> signals were added to signal completion.  More recently, an eventfd interface
> was added.  This patch relies on the latter.
>
> We try hard to detect whether the right support is available in configure to
> avoid compile failures.

> +do {
> + err = io_submit(aio_ctxt_id, 1, iocbs);
> +} while (err == -1 && errno == EINTR);
> +
> +if (err != 1) {
> + fprintf(stderr, "failed to submit aio request: %m\n");
> + exit(1);
> +}
> +
> +outstanding_requests++;
> +
> +return &aiocb->common;
> +}
> +
> +static void la_wait(void)
> +{
> +main_loop_wait(10);
> +}

Sleeping in the context of vcpus is extremely bad (e.g. virtio-block
blocks in write() throttling, which kills performance). It should wait
on IO completions instead (qemu-kvm.c creates a pthread waitqueue to
resolve that issue).
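
To illustrate the difference between sleeping for a fixed interval and waiting on completions, here is a minimal sketch of a pthread-based wait queue. This is not the qemu-kvm.c code; struct io_waitq and the io_waitq_* functions are hypothetical names invented for this example, and the sketch only assumes that some completion path calls io_waitq_complete() once per finished request.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical wait queue (an assumption, not the qemu-kvm.c implementation). */
struct io_waitq {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             outstanding;   /* requests submitted but not yet completed */
};

static void io_waitq_init(struct io_waitq *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->cond, NULL);
    q->outstanding = 0;
}

/* Called by the submitter for every request handed to the kernel. */
static void io_waitq_submit(struct io_waitq *q)
{
    pthread_mutex_lock(&q->lock);
    q->outstanding++;
    pthread_mutex_unlock(&q->lock);
}

/* Called from the completion path; wakes anyone blocked in io_waitq_drain(). */
static void io_waitq_complete(struct io_waitq *q)
{
    pthread_mutex_lock(&q->lock);
    assert(q->outstanding > 0);
    q->outstanding--;
    pthread_cond_broadcast(&q->cond);
    pthread_mutex_unlock(&q->lock);
}

/* Blocks until every outstanding request has completed; no fixed sleep. */
static void io_waitq_drain(struct io_waitq *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->outstanding > 0)
        pthread_cond_wait(&q->cond, &q->lock);
    pthread_mutex_unlock(&q->lock);
}
```

The waiter wakes as soon as the last completion is reported, rather than after whatever remains of a 10 ms timeout.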

Other than that looks fine to me, will give it a try.

-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 3/3] Implement linux-aio backend

2008-04-18 Thread Anthony Liguori
Marcelo Tosatti wrote:
> On Thu, Apr 17, 2008 at 02:26:52PM -0500, Anthony Liguori wrote:
>> This patch introduces a Linux-aio backend that is disabled by default.  To
>> use this backend effectively, the user should disable caching and select
>> it with the appropriate -aio option.  For instance:
>>
>> qemu-system-x86_64 -drive foo.img,cache=off -aio linux
>>
>> There's no universal way to asynchronously wait with linux-aio.  At some point,
>> signals were added to signal completion.  More recently, an eventfd interface
>> was added.  This patch relies on the latter.
>>
>> We try hard to detect whether the right support is available in configure to
>> avoid compile failures.
>
>> +do {
>> +err = io_submit(aio_ctxt_id, 1, iocbs);
>> +} while (err == -1 && errno == EINTR);
>> +
>> +if (err != 1) {
>> +fprintf(stderr, "failed to submit aio request: %m\n");
>> +exit(1);
>> +}
>> +
>> +outstanding_requests++;
>> +
>> +return &aiocb->common;
>> +}
>> +
>> +static void la_wait(void)
>> +{
>> +main_loop_wait(10);
>> +}
>
> Sleeping in the context of vcpus is extremely bad (e.g. virtio-block
> blocks in write() throttling, which kills performance). It should wait
> on IO completions instead (qemu-kvm.c creates a pthread waitqueue to
> resolve that issue).
>
> Other than that looks fine to me, will give it a try.

FWIW, I'm not getting wonderful results in KVM.  It's hard to tell 
though because time seems wildly inaccurate (even with kvm clock in the 
guest).  The time issue appears unrelated to this set of patches.

Regards,

Anthony Liguori



Re: [kvm-devel] [PATCH 3/3] Implement linux-aio backend

2008-04-18 Thread Marcelo Tosatti
On Fri, Apr 18, 2008 at 10:18:33AM -0500, Anthony Liguori wrote:
>> Sleeping in the context of vcpus is extremely bad (e.g. virtio-block
>> blocks in write() throttling, which kills performance). It should wait
>> on IO completions instead (qemu-kvm.c creates a pthread waitqueue to
>> resolve that issue).
>>
>> Other than that looks fine to me, will give it a try.
>
> FWIW, I'm not getting wonderful results in KVM.  It's hard to tell
> though because time seems wildly inaccurate (even with kvm clock in the
> guest).  The time issue appears unrelated to this set of patches.

Oh, you won't get completion signals on the aio eventfd. You might want
to try the select-with-timeout() stuff.

Will submit that with proper signalfd emulation shortly.
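
One possible reading of the select-with-timeout approach mentioned above, as a minimal sketch: block in select() on the completion descriptor with an upper bound on the wait, then read the 8-byte counter. This is not the code from the forthcoming patch; wait_for_completions is a hypothetical name, and the sketch assumes the descriptor delivers a uint64_t count the way an eventfd does.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable, then read the 8-byte
 * completion counter into *count.
 * Returns 1 if a counter was read, 0 on timeout, -1 on error.
 */
static int wait_for_completions(int fd, int timeout_ms, uint64_t *count)
{
    fd_set rfds;
    struct timeval tv;
    ssize_t len;
    int ret;

    do {
        /* select() may modify both arguments, so rebuild them on every retry.
         * Note that a retry after EINTR restarts the full timeout. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        ret = select(fd + 1, &rfds, NULL, NULL, &tv);
    } while (ret == -1 && errno == EINTR);

    if (ret <= 0)
        return ret;            /* 0: timed out, -1: select() failed */

    do {
        len = read(fd, count, sizeof(*count));
    } while (len == -1 && errno == EINTR);

    return len == (ssize_t)sizeof(*count) ? 1 : -1;
}
```

Unlike a fixed sleep, the call returns as soon as the descriptor becomes readable; the timeout only bounds the wait when nothing completes.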



[kvm-devel] [PATCH 3/3] Implement linux-aio backend

2008-04-17 Thread Anthony Liguori
This patch introduces a Linux-aio backend that is disabled by default.  To
use this backend effectively, the user should disable caching and select
it with the appropriate -aio option.  For instance:

qemu-system-x86_64 -drive foo.img,cache=off -aio linux

There's no universal way to asynchronously wait with linux-aio.  At some point,
signals were added to signal completion.  More recently, an eventfd interface
was added.  This patch relies on the latter.
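
For readers unfamiliar with eventfd, here is a minimal sketch of its counter semantics, which is what lets a single read() report how many requests have completed. It assumes glibc's two-argument eventfd() wrapper from <sys/eventfd.h> rather than the raw-syscall wrapper defined in this patch; efd_signal and efd_drain are hypothetical helper names, not part of the patch.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Add n to the eventfd counter. Returns 0 on success, -1 on error. */
static int efd_signal(int efd, uint64_t n)
{
    return write(efd, &n, sizeof(n)) == (ssize_t)sizeof(n) ? 0 : -1;
}

/*
 * Read the accumulated counter and reset it to zero.
 * Returns the counter value, or 0 if the read failed.
 * On a blocking eventfd this call blocks while the counter is zero.
 */
static uint64_t efd_drain(int efd)
{
    uint64_t count = 0;

    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return 0;
    return count;
}
```

With IOCB_FLAG_RESFD set on a request, the kernel performs the increment itself, once per completed request, so the value read back is the number of completions since the previous read.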

We try hard to detect whether the right support is available in configure to
avoid compile failures.

Signed-off-by: Anthony Liguori [EMAIL PROTECTED]

diff --git a/Makefile.target b/Makefile.target
index f635d68..289887c 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -487,6 +487,9 @@ OBJS+=block-raw-win32.o
 else
 OBJS+=block-raw-posix.o aio-posix.o
 endif
+ifdef CONFIG_LINUX_AIO
+OBJS+=aio-linux.o
+endif
 
 LIBS+=-lz
 ifdef CONFIG_ALSA
diff --git a/aio-linux.c b/aio-linux.c
new file mode 100644
index 0000000..f5c222b
--- /dev/null
+++ b/aio-linux.c
@@ -0,0 +1,210 @@
+/*
+ * QEMU Linux AIO Support
+ *
+ * Copyright IBM, Corp. 2008
+ *
+ * Authors:
+ *  Anthony Liguori   [EMAIL PROTECTED]
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "qemu-char.h"
+#include "block.h"
+#include "block_int.h"
+#include "block-aio.h"
+#include "sysemu.h"
+
+#include <sys/types.h>
+#include <sys/syscall.h>
+#include <linux/aio_abi.h>
+
+int eventfd(unsigned int initval)
+{
+return syscall(SYS_eventfd, initval);
+}
+
+int io_setup(unsigned nr_reqs, aio_context_t *ctx_id)
+{
+return syscall(SYS_io_setup, nr_reqs, ctx_id);
+}
+
+int io_destroy(aio_context_t ctx_id)
+{
+return syscall(SYS_io_destroy, ctx_id);
+}
+
+int io_getevents(aio_context_t ctx_id, long min_nr, long nr,
+struct io_event *events, struct timespec *timeout)
+{
+return syscall(SYS_io_getevents, ctx_id, min_nr, nr, events, timeout);
+}
+
+int io_submit(aio_context_t ctx_id, long nr, struct iocb **iocb)
+{
+return syscall(SYS_io_submit, ctx_id, nr, iocb);
+}
+
+int io_cancel(aio_context_t ctx_id, struct iocb *iocb, struct io_event *result)
+{
+return syscall(SYS_io_cancel, ctx_id, iocb, result);
+}
+
+typedef struct LinuxAIOCB {
+BlockDriverAIOCB common;
+struct iocb iocb;
+} LinuxAIOCB;
+
+static int aio_efd;
+static aio_context_t aio_ctxt_id;
+static int outstanding_requests;
+
+static BlockDriverAIOCB *la_submit(BlockDriverState *bs,
+  int fd, int64_t sector_num,
+  void *buf, int nb_sectors, int write,
+  BlockDriverCompletionFunc *cb,
+  void *opaque)
+{
+LinuxAIOCB *aiocb;
+struct iocb *iocbs[1];
+int err;
+
+aiocb = qemu_aio_get(bs, cb, opaque);
+if (!aiocb) {
+   printf("returning null??\n");
+   return NULL;
+}
+
+if (write)
+   aiocb->iocb.aio_lio_opcode = IOCB_CMD_PWRITE;
+else
+   aiocb->iocb.aio_lio_opcode = IOCB_CMD_PREAD;
+
+aiocb->iocb.aio_data = (unsigned long)aiocb;
+aiocb->iocb.aio_fildes = fd;
+aiocb->iocb.aio_flags = IOCB_FLAG_RESFD;
+aiocb->iocb.aio_resfd = aio_efd;
+aiocb->iocb.aio_buf = (unsigned long)buf;
+aiocb->iocb.aio_nbytes = nb_sectors * 512;
+aiocb->iocb.aio_offset = sector_num * 512;
+
+iocbs[0] = &aiocb->iocb;
+
+do {
+   err = io_submit(aio_ctxt_id, 1, iocbs);
+} while (err == -1 && errno == EINTR);
+
+if (err != 1) {
+   fprintf(stderr, "failed to submit aio request: %m\n");
+   exit(1);
+}
+
+outstanding_requests++;
+
+return &aiocb->common;
+}
+
+static void la_wait(void)
+{
+main_loop_wait(10);
+}
+
+static void la_flush(void)
+{
+while (outstanding_requests)
+   la_wait();
+}
+
+static void la_cancel(BlockDriverAIOCB *baiocb)
+{
+LinuxAIOCB *aiocb = (void *)baiocb;
+struct io_event result;
+int err;
+
+do {
+   err = io_cancel(aio_ctxt_id, &aiocb->iocb, &result);
+} while (err == -1 && errno == EINTR);
+
+/* it may have happened...  we probably should check and complete */
+
+outstanding_requests--;
+
+qemu_aio_release(aiocb);
+}
+
+static void la_completion(void *opaque)
+{
+struct io_event events[256];
+struct timespec ts = {0, 0};
+uint64_t count;
+int i, ret;
+
+do {
+   ret = read(aio_efd, &count, sizeof(count));
+} while (ret == -1 && errno == EINTR);
+
+if (ret != 8) {
+   fprintf(stderr, "bad read from eventfd\n");
+   exit(1);
+}
+
+do {
+   ret = io_getevents(aio_ctxt_id, count, ARRAY_SIZE(events),
+  events, &ts);
+} while (ret == -1 && errno == EINTR);
+
+if (ret < count) {
+   fprintf(stderr, "io_getevents failed\n");
+   exit(1);
+}
+
+for (i = 0; i < ret; i++) {
+   LinuxAIOCB *aiocb;
+   int res;
+
+   aiocb = (LinuxAIOCB