Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-25 Thread ronnie sahlberg
On Wed, May 23, 2012 at 2:31 AM, Michael Tokarev m...@tls.msk.ru wrote:
 On 22.05.2012 14:03, ronnie sahlberg wrote:
 []
 Doh, now I remember.  Whenever you change the aio handlers you need to
 call qemu_notify_event() afterwards, if the handler may fire right away.

 Thanks.  I just confirmed that qemu_notify_event() fixes the issue.
 Ill send a patch that uses qemu_notify_event() and a comment why this is 
 needed.

 When I were writing network block driver I tried to not call 
 qemu_aio_set_fd_handler()
 unless absolutely necessary -- that is, trying to write as much as possible 
 till
 the next write returns EAGAIN, or to read till the next read returns EAGAIN.  
 This
 way there's no need to call qemu_notify_event().  FWIW.


Good idea. Ill create a patch that keeps track of which events we are
listening on already and only calling these functions when the events
change.

regards
ronnie sahlberg



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-22 Thread ronnie sahlberg
Hi Stefan

On Mon, May 21, 2012 at 11:20 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Mon, May 21, 2012 at 12:00 PM, ronnie sahlberg
 ronniesahlb...@gmail.com wrote:
 Yes, I use IDE since I boot from this LUN.


 I just managed to track it down to the IDE changes.

 It looks like basically this change triggered it :

 commit bef0fd5958120542f126f2dedbfce65d8839a94d
 Author: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
 Date:   Thu Mar 29 10:31:30 2012 +0100
    ide: convert ide_sector_read() to asynchronous I/O


 Stefan, any ideas?   Ill continue trying to understand the IDE code
 tomorrow and maybe I make some progress.

 It sounds like aio for iSCSI is broken.  IDE submits bdrv_aio_readv()
 requests which iSCSI handles.  QEMU main loop should find that the
 iSCSI socket file descriptor becomes readable when the SCSI response
 arrives.  We can then report I/O completion back to the guest.

 However, it appears that completions are not triggering.  My guess is
 that once the OS is fully booted you don't see the problem because
 other events are pumping the main loop besides a 55ms timer.
 Therefore this problem probably applies across the board but it's most
 noticable when the BIOS/bootloader/early OS is executing.

I booted off a CDROM image and then had iscsi device as a disk.
From the cdrom booted guest I then did 'dd if=/dev/sda of=/dev/null
bs=512 iflag=direct'
but did not see these 55ms delays anymore.
I did however see a ~4ms delay .



 Have you added debugging into block/iscsi.c or libiscsi to watch when
 read handlers get called?  Using tcpdump or strace might also be
 useful to figure out when the reply becomes available and why we
 aren't registering a read handler function with the QEMU main loop.

I just did that, and I print a message and timeval to stdout for the
functions in block/iscsi.c
What seems to happen is that we call iscsi_aio_readv() which sets up a
write event  but this event is not triggered for another 55ms.
Since the descriptor is writeable at this point,  i would expected
that the write_event_callback would trigger almost immediately.


Below   iscsi_aio_readv   is printed when we enter this function.
This function prepares a CDB and sets it up for queueing,  but it does
not actually write it to the socket.
At the end of this function, we set up the events  by calling
iscsi_set_events() and we set it up for writeable.

iscsi_process_write() is when we enter the fd-is-writeable callback.

iscsi_aio_readv   1337676989.429822
iscsi_set_events   write_event?Y  1337676989.429834
iscsi_process_write   1337676989.484424
iscsi_set_events   write_event?N  1337676989.484507
iscsi_process_read1337676989.484628
iscsi_aio_read10_cb   1337676989.484647
iscsi_set_events   write_event?N  1337676989.484660
iscsi_readv_writev_bh_cb  1337676989.484665
iscsi_aio_readv   1337676989.484751


So  what happens seems to be that once we have set up the i/o  and
queued it,  and we set up the writeable event,
but this event is not triggered immediately! It takes another 55ms
before the event is triggered.


To me this looks like a bug in the eventsystem. The socket is always
writeable. shouldnt the callback be invoked almost immediately here ?


regards
ronnie sahlberg



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-22 Thread ronnie sahlberg
Hi,

Now that I see what happens, I can easily workaround this in block/iscsi.c
by the patch below, but I dont know if this is the right thing to do.

It does appear that here,  when I use qemu_set_fd_handler() and add a
handler for writeble it takes 55ms before the event system notices
this and reacts.



diff --git a/block/iscsi.c b/block/iscsi.c
index d37c4ee..1ebff0f 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -105,6 +105,10 @@ iscsi_set_events(IscsiLun *iscsilun)
 {
 struct iscsi_context *iscsi = iscsilun-iscsi;

+if (iscsi_which_events(iscsi)  POLLOUT) {
+iscsi_process_write(iscsilun);
+}
+
 qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), iscsi_process_read,
(iscsi_which_events(iscsi)  POLLOUT)
? iscsi_process_write : NULL,




On Tue, May 22, 2012 at 7:07 PM, ronnie sahlberg
ronniesahlb...@gmail.com wrote:
 Hi Stefan

 On Mon, May 21, 2012 at 11:20 PM, Stefan Hajnoczi stefa...@gmail.com wrote:
 On Mon, May 21, 2012 at 12:00 PM, ronnie sahlberg
 ronniesahlb...@gmail.com wrote:
 Yes, I use IDE since I boot from this LUN.


 I just managed to track it down to the IDE changes.

 It looks like basically this change triggered it :

 commit bef0fd5958120542f126f2dedbfce65d8839a94d
 Author: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
 Date:   Thu Mar 29 10:31:30 2012 +0100
    ide: convert ide_sector_read() to asynchronous I/O


 Stefan, any ideas?   Ill continue trying to understand the IDE code
 tomorrow and maybe I make some progress.

 It sounds like aio for iSCSI is broken.  IDE submits bdrv_aio_readv()
 requests which iSCSI handles.  QEMU main loop should find that the
 iSCSI socket file descriptor becomes readable when the SCSI response
 arrives.  We can then report I/O completion back to the guest.

 However, it appears that completions are not triggering.  My guess is
 that once the OS is fully booted you don't see the problem because
 other events are pumping the main loop besides a 55ms timer.
 Therefore this problem probably applies across the board but it's most
 noticable when the BIOS/bootloader/early OS is executing.

 I booted off a CDROM image and then had iscsi device as a disk.
 From the cdrom booted guest I then did 'dd if=/dev/sda of=/dev/null
 bs=512 iflag=direct'
 but did not see these 55ms delays anymore.
 I did however see a ~4ms delay .



 Have you added debugging into block/iscsi.c or libiscsi to watch when
 read handlers get called?  Using tcpdump or strace might also be
 useful to figure out when the reply becomes available and why we
 aren't registering a read handler function with the QEMU main loop.

 I just did that, and I print a message and timeval to stdout for the
 functions in block/iscsi.c
 What seems to happen is that we call iscsi_aio_readv() which sets up a
 write event  but this event is not triggered for another 55ms.
 Since the descriptor is writeable at this point,  i would expected
 that the write_event_callback would trigger almost immediately.


 Below   iscsi_aio_readv   is printed when we enter this function.
 This function prepares a CDB and sets it up for queueing,  but it does
 not actually write it to the socket.
 At the end of this function, we set up the events  by calling
 iscsi_set_events() and we set it up for writeable.

 iscsi_process_write() is when we enter the fd-is-writeable callback.

 iscsi_aio_readv                   1337676989.429822
 iscsi_set_events   write_event?Y  1337676989.429834
 iscsi_process_write               1337676989.484424
 iscsi_set_events   write_event?N  1337676989.484507
 iscsi_process_read                1337676989.484628
 iscsi_aio_read10_cb               1337676989.484647
 iscsi_set_events   write_event?N  1337676989.484660
 iscsi_readv_writev_bh_cb          1337676989.484665
 iscsi_aio_readv                   1337676989.484751


 So  what happens seems to be that once we have set up the i/o  and
 queued it,  and we set up the writeable event,
 but this event is not triggered immediately! It takes another 55ms
 before the event is triggered.


 To me this looks like a bug in the eventsystem. The socket is always
 writeable. shouldnt the callback be invoked almost immediately here ?


 regards
 ronnie sahlberg



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-22 Thread Paolo Bonzini
Il 22/05/2012 11:15, ronnie sahlberg ha scritto:
 Hi,
 
 Now that I see what happens, I can easily workaround this in block/iscsi.c
 by the patch below, but I dont know if this is the right thing to do.
 
 It does appear that here,  when I use qemu_set_fd_handler() and add a
 handler for writeble it takes 55ms before the event system notices
 this and reacts.
 
 
 
 diff --git a/block/iscsi.c b/block/iscsi.c
 index d37c4ee..1ebff0f 100644
 --- a/block/iscsi.c
 +++ b/block/iscsi.c
 @@ -105,6 +105,10 @@ iscsi_set_events(IscsiLun *iscsilun)
  {
  struct iscsi_context *iscsi = iscsilun-iscsi;
 
 +if (iscsi_which_events(iscsi)  POLLOUT) {
 +iscsi_process_write(iscsilun);
 +}
 +
  qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), iscsi_process_read,
 (iscsi_which_events(iscsi)  POLLOUT)
 ? iscsi_process_write : NULL,

Doh, now I remember.  Whenever you change the aio handlers you need to
call qemu_notify_event() afterwards, if the handler may fire right away.
 The alternative is your patch, which is correct.  Can you resend it
with SoB?

Paolo



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-22 Thread ronnie sahlberg
On Tue, May 22, 2012 at 7:29 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 22/05/2012 11:15, ronnie sahlberg ha scritto:
 Hi,

 Now that I see what happens, I can easily workaround this in block/iscsi.c
 by the patch below, but I dont know if this is the right thing to do.

 It does appear that here,  when I use qemu_set_fd_handler() and add a
 handler for writeble it takes 55ms before the event system notices
 this and reacts.



 diff --git a/block/iscsi.c b/block/iscsi.c
 index d37c4ee..1ebff0f 100644
 --- a/block/iscsi.c
 +++ b/block/iscsi.c
 @@ -105,6 +105,10 @@ iscsi_set_events(IscsiLun *iscsilun)
  {
      struct iscsi_context *iscsi = iscsilun-iscsi;

 +    if (iscsi_which_events(iscsi)  POLLOUT) {
 +        iscsi_process_write(iscsilun);
 +    }
 +
      qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), iscsi_process_read,
                             (iscsi_which_events(iscsi)  POLLOUT)
                             ? iscsi_process_write : NULL,

 Doh, now I remember.  Whenever you change the aio handlers you need to
 call qemu_notify_event() afterwards, if the handler may fire right away.

Thanks.  I just confirmed that qemu_notify_event() fixes the issue.
Ill send a patch that uses qemu_notify_event() and a comment why this is needed.


regards
ronnie sahlberg



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-22 Thread ronnie sahlberg
None of the other drivers in block/*.c call qemu_notify_event()

Do you you think those should be audited and have this call added to
where required ?


regards
ronnie sahlberg

On Tue, May 22, 2012 at 7:48 PM, ronnie sahlberg
ronniesahlb...@gmail.com wrote:
 On Tue, May 22, 2012 at 7:29 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 22/05/2012 11:15, ronnie sahlberg ha scritto:
 Hi,

 Now that I see what happens, I can easily workaround this in block/iscsi.c
 by the patch below, but I dont know if this is the right thing to do.

 It does appear that here,  when I use qemu_set_fd_handler() and add a
 handler for writeble it takes 55ms before the event system notices
 this and reacts.



 diff --git a/block/iscsi.c b/block/iscsi.c
 index d37c4ee..1ebff0f 100644
 --- a/block/iscsi.c
 +++ b/block/iscsi.c
 @@ -105,6 +105,10 @@ iscsi_set_events(IscsiLun *iscsilun)
  {
      struct iscsi_context *iscsi = iscsilun-iscsi;

 +    if (iscsi_which_events(iscsi)  POLLOUT) {
 +        iscsi_process_write(iscsilun);
 +    }
 +
      qemu_aio_set_fd_handler(iscsi_get_fd(iscsi), iscsi_process_read,
                             (iscsi_which_events(iscsi)  POLLOUT)
                             ? iscsi_process_write : NULL,

 Doh, now I remember.  Whenever you change the aio handlers you need to
 call qemu_notify_event() afterwards, if the handler may fire right away.

 Thanks.  I just confirmed that qemu_notify_event() fixes the issue.
 Ill send a patch that uses qemu_notify_event() and a comment why this is 
 needed.


 regards
 ronnie sahlberg



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-22 Thread Michael Tokarev
On 22.05.2012 14:03, ronnie sahlberg wrote:
[]
 Doh, now I remember.  Whenever you change the aio handlers you need to
 call qemu_notify_event() afterwards, if the handler may fire right away.

 Thanks.  I just confirmed that qemu_notify_event() fixes the issue.
 Ill send a patch that uses qemu_notify_event() and a comment why this is 
 needed.

When I were writing network block driver I tried to not call 
qemu_aio_set_fd_handler()
unless absolutely necessary -- that is, trying to write as much as possible till
the next write returns EAGAIN, or to read till the next read returns EAGAIN.  
This
way there's no need to call qemu_notify_event().  FWIW.

/mjt



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-21 Thread Kevin Wolf
Am 21.05.2012 08:07, schrieb ronnie sahlberg:
 List, Kevin,
 
 Since this merge :
 commit 1f8bcac09af61e58c5121aa0a932190700ad554d
 Merge: cb4c254 1042ec9
 Author: Anthony Liguori aligu...@us.ibm.com
 Date:   Mon Apr 23 14:27:04 2012 -0500
 
 Merge remote-tracking branch 'kwolf/for-anthony' into staging
 
 * kwolf/for-anthony: (38 commits)
   qemu-iotests: Fix test 031 for qcow2 v3 support
   qemu-iotests: Add -o and make v3 the default for qcow2
   qcow2: Zero write support
   qemu-iotests: Test backing file COW with zero clusters
   qemu-iotests: add a simple test for write_zeroes
   qcow2: Support for feature table header extension
   qcow2: Support reading zero clusters
   qcow2: Version 3 images
   qcow2: Ignore reserved bits in check_refcounts
   qcow2: Ignore reserved bits in refcount table entries
   qcow2: Simplify count_cow_clusters
   qcow2: Refactor qcow2_free_any_clusters
   qcow2: Ignore reserved bits in L1/L2 entries
   qcow2: Fail write_compressed when overwriting data
   qcow2: Ignore reserved bits in count_contiguous_clusters()
   qcow2: Ignore reserved bits in get_cluster_offset
   qcow2: Save disk size in snapshot header
   Specification for qcow2 version 3
   qcow2: Fix refcount block allocation during qcow2_alloc_cluster_at()
   iotests: Resolve test failures caused by hostname
   ...
 
 I am seeing some weirdness when using iscsi.
 I have isolated it to this particular commit, but since it is 3900
 lines in sinze I have not yet found the exact change that triggers
 this particular behaviour.
 
 It shows up when using an iscsi device to boot from, where when during
 the bios boot and later grub boot almost all I/O has a pause of 55ms
 between them.
 
 During boot the bios and later grub will read a lot of data, primarily
 sequentially and one single block at a time.
 After these changes were applied there is now an approximate 55ms
 delay between all these I/O, causing the boot process to become very
 slow.
 
 
 I have not yet found the exact part of this big patch that cause this
 slowdown, but will continue investigating.
 
 I am posting this here in case someone has any idea  or if 55ms rings any 
 bells.

The only thing in this merge that looks as if it could be related is
Paolo's AIO changes between commit adfe92f6 and 9eb0bfca.

Or if you use IDE, Stefan's patches to make PIO operations asynchronous
could be related.

Kevin



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-21 Thread Paolo Bonzini
Il 21/05/2012 12:45, Kevin Wolf ha scritto:
 I have not yet found the exact part of this big patch that cause this
 slowdown, but will continue investigating.

 I am posting this here in case someone has any idea  or if 55ms rings any 
 bells.

55ms is the frequency of the PIT while the BIOS is running.  I'd guess
something is not waking up the iothread properly.

 The only thing in this merge that looks as if it could be related is
 Paolo's AIO changes between commit adfe92f6 and 9eb0bfca.
 
 Or if you use IDE, Stefan's patches to make PIO operations asynchronous
 could be related.

Yeah, both seem just as likely to be the culprit.  Please bisect.

Paolo



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-21 Thread ronnie sahlberg
Yes, I use IDE since I boot from this LUN.


I just managed to track it down to the IDE changes.

It looks like basically this change triggered it :

commit bef0fd5958120542f126f2dedbfce65d8839a94d
Author: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Date:   Thu Mar 29 10:31:30 2012 +0100
ide: convert ide_sector_read() to asynchronous I/O


Stefan, any ideas?   Ill continue trying to understand the IDE code
tomorrow and maybe I make some progress.

regards
ronnie sahlberg



On Mon, May 21, 2012 at 8:45 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 21.05.2012 08:07, schrieb ronnie sahlberg:
 List, Kevin,

 Since this merge :
 commit 1f8bcac09af61e58c5121aa0a932190700ad554d
 Merge: cb4c254 1042ec9
 Author: Anthony Liguori aligu...@us.ibm.com
 Date:   Mon Apr 23 14:27:04 2012 -0500

     Merge remote-tracking branch 'kwolf/for-anthony' into staging

     * kwolf/for-anthony: (38 commits)
       qemu-iotests: Fix test 031 for qcow2 v3 support
       qemu-iotests: Add -o and make v3 the default for qcow2
       qcow2: Zero write support
       qemu-iotests: Test backing file COW with zero clusters
       qemu-iotests: add a simple test for write_zeroes
       qcow2: Support for feature table header extension
       qcow2: Support reading zero clusters
       qcow2: Version 3 images
       qcow2: Ignore reserved bits in check_refcounts
       qcow2: Ignore reserved bits in refcount table entries
       qcow2: Simplify count_cow_clusters
       qcow2: Refactor qcow2_free_any_clusters
       qcow2: Ignore reserved bits in L1/L2 entries
       qcow2: Fail write_compressed when overwriting data
       qcow2: Ignore reserved bits in count_contiguous_clusters()
       qcow2: Ignore reserved bits in get_cluster_offset
       qcow2: Save disk size in snapshot header
       Specification for qcow2 version 3
       qcow2: Fix refcount block allocation during qcow2_alloc_cluster_at()
       iotests: Resolve test failures caused by hostname
       ...

 I am seeing some weirdness when using iscsi.
 I have isolated it to this particular commit, but since it is 3900
 lines in sinze I have not yet found the exact change that triggers
 this particular behaviour.

 It shows up when using an iscsi device to boot from, where when during
 the bios boot and later grub boot almost all I/O has a pause of 55ms
 between them.

 During boot the bios and later grub will read a lot of data, primarily
 sequentially and one single block at a time.
 After these changes were applied there is now an approximate 55ms
 delay between all these I/O, causing the boot process to become very
 slow.


 I have not yet found the exact part of this big patch that cause this
 slowdown, but will continue investigating.

 I am posting this here in case someone has any idea  or if 55ms rings any 
 bells.

 The only thing in this merge that looks as if it could be related is
 Paolo's AIO changes between commit adfe92f6 and 9eb0bfca.

 Or if you use IDE, Stefan's patches to make PIO operations asynchronous
 could be related.

 Kevin



Re: [Qemu-devel] Weird iscsi/fd-event issue since recent merge of event system changes

2012-05-21 Thread Stefan Hajnoczi
On Mon, May 21, 2012 at 12:00 PM, ronnie sahlberg
ronniesahlb...@gmail.com wrote:
 Yes, I use IDE since I boot from this LUN.


 I just managed to track it down to the IDE changes.

 It looks like basically this change triggered it :

 commit bef0fd5958120542f126f2dedbfce65d8839a94d
 Author: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
 Date:   Thu Mar 29 10:31:30 2012 +0100
    ide: convert ide_sector_read() to asynchronous I/O


 Stefan, any ideas?   Ill continue trying to understand the IDE code
 tomorrow and maybe I make some progress.

It sounds like aio for iSCSI is broken.  IDE submits bdrv_aio_readv()
requests which iSCSI handles.  QEMU main loop should find that the
iSCSI socket file descriptor becomes readable when the SCSI response
arrives.  We can then report I/O completion back to the guest.

However, it appears that completions are not triggering.  My guess is
that once the OS is fully booted you don't see the problem because
other events are pumping the main loop besides a 55ms timer.
Therefore this problem probably applies across the board but it's most
noticable when the BIOS/bootloader/early OS is executing.

Have you added debugging into block/iscsi.c or libiscsi to watch when
read handlers get called?  Using tcpdump or strace might also be
useful to figure out when the reply becomes available and why we
aren't registering a read handler function with the QEMU main loop.

Stefan