Re: [PATCH v5 0/7] vhost-user-blk: fix the migration issue and enhance qtests

2020-10-07 Thread Dima Stepanov
On Tue, Sep 29, 2020 at 12:48:38PM +0300, Dima Stepanov wrote:
> On Tue, Sep 29, 2020 at 03:13:09AM -0400, Michael S. Tsirkin wrote:
> > On Sun, Sep 27, 2020 at 09:48:28AM +0300, Dima Stepanov wrote:
> > > On Thu, Sep 24, 2020 at 07:26:14AM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Sep 11, 2020 at 11:39:42AM +0300, Dima Stepanov wrote:
> > > > > v4 -> v5:
> > > > >   - vhost: check queue state in the vhost_dev_set_log routine
> > > > > tests/qtest/vhost-user-test: prepare the tests for adding new
> > > > > dev class
> > > > > tests/qtest/vhost-user-test: add support for the
> > > > > vhost-user-blk device
> > > > > tests/qtest/vhost-user-test: add migrate_reconnect test
> > > > > Reviewed-by: Raphael Norwitz
> > > > >   - Update qtest, by merging vhost-user-blk "if" case with the
> > > > > virtio-blk case.
> > > > 
> > > > I dropped patches 3-7 since they were stalling on some systems.
> > > > Pls work with Peter Maydell (cc'd) to figure it out.
> > > Thanks!
> > > 
> > > Peter, can you share any details for the stalling errors with me?
> > 
> > I can say for sure that even on x86/linux the affected tests take
> > much longer to run with these applied.
> > I'd suggest making sure there are no timeouts involved in the good case 
> Could you help me to reproduce it? Because on my system i see only 10+ seconds
> increase for the qos-test set to pass (both x86_64 and i386). So on the
> current master i'm running it like:
>   - ./configure  --target-list="x86_64-softmmu i386-softmmu"
>   - with no patch set:
> time QTEST_QEMU_BINARY=./build/x86_64-softmmu/qemu-system-x86_64 
> ./build/tests/qtest/qos-test
> real0m6.394s
> user0m3.643s
> sys 0m3.477s
>   - without patch 7 (where i include vhost-user-net tests also):
> real0m9.955s
> user0m4.133s
> sys 0m4.397s
>   - full patch set:
> real0m17.263s
> user0m4.530s
> sys 0m4.802s
> For i386 target i see pretty the same numbers:
>   time QTEST_QEMU_BINARY=./build/i386-softmmu/qemu-system-i386 
> ./build/tests/qtest/qos-test
>   real0m17.386s
>   user0m4.503s
>   sys 0m4.911s
> So it looks like that i'm missing some step to reproduce an issue.
> 
> And if i run the exact test it takes ~2-3s to pass:
>   $ time QTEST_QEMU_BINARY=./build/x86_64-softmmu/qemu-system-x86_64 
> ./build/tests/qtest/qos-test -p 
> /x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/vhost-user-blk-pci/vhost-user-blk/vhost-user-blk-tests/reconnect
>   
> /x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/vhost-user-blk-pci/vhost-user-blk/vhost-user-blk-tests/reconnect:
>  OK
>   real0m2.253s
>   user0m0.118s
>   sys 0m0.104s
> And same numbers for i386.
> 
Sorry, but still fail to reproduce the stall issue on both i386 and
x86_64 targets. Any help is highly appreciated.

Thanks, Dima.

> > 
> > > > 
> > > > 
> > > > > v3 -> v4:
> > > > >   - vhost: recheck dev state in the vhost_migration_log routine
> > > > > Reviewed-by: Raphael Norwitz
> > > > >   - vhost: check queue state in the vhost_dev_set_log routine
> > > > > Use "continue" instead of "break" to handle non-initialized
> > > > > virtqueue case.
> > > > > 
> > > > > v2 -> v3:
> > > > >   - update commit message for the 
> > > > > "vhost: recheck dev state in the vhost_migration_log routine" 
> > > > > commit
> > > > >   - rename "started" field of the VhostUserBlk structure to
> > > > > "started_vu", so there will be no confustion with the VHOST 
> > > > > started
> > > > > field
> > > > >   - update vhost-user-test.c to always initialize nq local variable
> > > > > (spotted by patchew)
> > > > > 
> > > > > v1 -> v2:
> > > > >   - add comments to connected/started fields in the header file
> > > > >   - move the "s->started" logic from the vhost_user_blk_disconnect
> > > > > routine to the vhost_user_blk_stop routine
> > > > > 
> > > > > Reference e-mail threads:
> > > > >   - 
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
> > > > >   - 
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html
> > > > > 
> > > > > If vhost-user daemon is used as a backend for the vhost device, then 
> > > > > we
> > > > > should consider a possibility of disconnect at any moment. There was 
> > > > > a general
> > > > > question here: should we consider it as an error or okay state for 
> > > > > the vhost-user
> > > > > devices during migration process?
> > > > > I think the disconnect event for the vhost-user devices should not 
> > > > > break the
> > > > > migration process, because:
> > > > >   - the device will be in the stopped state, so it will not be changed
> > > > > during migration
> > > > >   - if reconnect will be made the migration log will be reinitialized 
> > > > > as
> > > > > part of reconnect/init process:
> > > > > #0  vhost_log_global_start (listener=0x563989cf7be0)
> > > > > at hw/virtio/vhost.c:920
> > > 

Re: [PATCH v5 0/7] vhost-user-blk: fix the migration issue and enhance qtests

2020-09-29 Thread Dima Stepanov
On Tue, Sep 29, 2020 at 03:13:09AM -0400, Michael S. Tsirkin wrote:
> On Sun, Sep 27, 2020 at 09:48:28AM +0300, Dima Stepanov wrote:
> > On Thu, Sep 24, 2020 at 07:26:14AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Sep 11, 2020 at 11:39:42AM +0300, Dima Stepanov wrote:
> > > > v4 -> v5:
> > > >   - vhost: check queue state in the vhost_dev_set_log routine
> > > > tests/qtest/vhost-user-test: prepare the tests for adding new
> > > > dev class
> > > > tests/qtest/vhost-user-test: add support for the
> > > > vhost-user-blk device
> > > > tests/qtest/vhost-user-test: add migrate_reconnect test
> > > > Reviewed-by: Raphael Norwitz
> > > >   - Update qtest, by merging vhost-user-blk "if" case with the
> > > > virtio-blk case.
> > > 
> > > I dropped patches 3-7 since they were stalling on some systems.
> > > Pls work with Peter Maydell (cc'd) to figure it out.
> > Thanks!
> > 
> > Peter, can you share any details for the stalling errors with me?
> 
> I can say for sure that even on x86/linux the affected tests take
> much longer to run with these applied.
> I'd suggest making sure there are no timeouts involved in the good case 
Could you help me to reproduce it? Because on my system i see only 10+ seconds
increase for the qos-test set to pass (both x86_64 and i386). So on the
current master i'm running it like:
  - ./configure  --target-list="x86_64-softmmu i386-softmmu"
  - with no patch set:
time QTEST_QEMU_BINARY=./build/x86_64-softmmu/qemu-system-x86_64 
./build/tests/qtest/qos-test
real0m6.394s
user0m3.643s
sys 0m3.477s
  - without patch 7 (where i include vhost-user-net tests also):
real0m9.955s
user0m4.133s
sys 0m4.397s
  - full patch set:
real0m17.263s
user0m4.530s
sys 0m4.802s
For i386 target i see pretty the same numbers:
  time QTEST_QEMU_BINARY=./build/i386-softmmu/qemu-system-i386 
./build/tests/qtest/qos-test
  real0m17.386s
  user0m4.503s
  sys 0m4.911s
So it looks like that i'm missing some step to reproduce an issue.

And if i run the exact test it takes ~2-3s to pass:
  $ time QTEST_QEMU_BINARY=./build/x86_64-softmmu/qemu-system-x86_64 
./build/tests/qtest/qos-test -p 
/x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/vhost-user-blk-pci/vhost-user-blk/vhost-user-blk-tests/reconnect
  
/x86_64/pc/i440FX-pcihost/pci-bus-pc/pci-bus/vhost-user-blk-pci/vhost-user-blk/vhost-user-blk-tests/reconnect:
 OK
  real0m2.253s
  user0m0.118s
  sys 0m0.104s
And same numbers for i386.

> 
> > > 
> > > 
> > > > v3 -> v4:
> > > >   - vhost: recheck dev state in the vhost_migration_log routine
> > > > Reviewed-by: Raphael Norwitz
> > > >   - vhost: check queue state in the vhost_dev_set_log routine
> > > > Use "continue" instead of "break" to handle non-initialized
> > > > virtqueue case.
> > > > 
> > > > v2 -> v3:
> > > >   - update commit message for the 
> > > > "vhost: recheck dev state in the vhost_migration_log routine" commit
> > > >   - rename "started" field of the VhostUserBlk structure to
> > > > "started_vu", so there will be no confustion with the VHOST started
> > > > field
> > > >   - update vhost-user-test.c to always initialize nq local variable
> > > > (spotted by patchew)
> > > > 
> > > > v1 -> v2:
> > > >   - add comments to connected/started fields in the header file
> > > >   - move the "s->started" logic from the vhost_user_blk_disconnect
> > > > routine to the vhost_user_blk_stop routine
> > > > 
> > > > Reference e-mail threads:
> > > >   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
> > > >   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html
> > > > 
> > > > If vhost-user daemon is used as a backend for the vhost device, then we
> > > > should consider a possibility of disconnect at any moment. There was a 
> > > > general
> > > > question here: should we consider it as an error or okay state for the 
> > > > vhost-user
> > > > devices during migration process?
> > > > I think the disconnect event for the vhost-user devices should not 
> > > > break the
> > > > migration process, because:
> > > >   - the device will be in the stopped state, so it will not be changed
> > > > during migration
> > > >   - if reconnect will be made the migration log will be reinitialized as
> > > > part of reconnect/init process:
> > > > #0  vhost_log_global_start (listener=0x563989cf7be0)
> > > > at hw/virtio/vhost.c:920
> > > > #1  0x56398603d8bc in listener_add_address_space 
> > > > (listener=0x563989cf7be0,
> > > > as=0x563986ea4340 )
> > > > at softmmu/memory.c:2664
> > > > #2  0x56398603dd30 in memory_listener_register 
> > > > (listener=0x563989cf7be0,
> > > > as=0x563986ea4340 )
> > > > at softmmu/memory.c:2740
> > > > #3  0x563985fd6956 in vhost_dev_init (hdev=0x563989cf7bd8,
> > > > opaque=0x563989cf7e30, 

Re: [PATCH v5 0/7] vhost-user-blk: fix the migration issue and enhance qtests

2020-09-29 Thread Michael S. Tsirkin
On Sun, Sep 27, 2020 at 09:48:28AM +0300, Dima Stepanov wrote:
> On Thu, Sep 24, 2020 at 07:26:14AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Sep 11, 2020 at 11:39:42AM +0300, Dima Stepanov wrote:
> > > v4 -> v5:
> > >   - vhost: check queue state in the vhost_dev_set_log routine
> > > tests/qtest/vhost-user-test: prepare the tests for adding new
> > > dev class
> > > tests/qtest/vhost-user-test: add support for the
> > > vhost-user-blk device
> > > tests/qtest/vhost-user-test: add migrate_reconnect test
> > > Reviewed-by: Raphael Norwitz
> > >   - Update qtest, by merging vhost-user-blk "if" case with the
> > > virtio-blk case.
> > 
> > I dropped patches 3-7 since they were stalling on some systems.
> > Pls work with Peter Maydell (cc'd) to figure it out.
> Thanks!
> 
> Peter, can you share any details for the stalling errors with me?

I can say for sure that even on x86/linux the affected tests take
much longer to run with these applied.
I'd suggest making sure there are no timeouts involved in the good case 

> > 
> > 
> > > v3 -> v4:
> > >   - vhost: recheck dev state in the vhost_migration_log routine
> > > Reviewed-by: Raphael Norwitz
> > >   - vhost: check queue state in the vhost_dev_set_log routine
> > > Use "continue" instead of "break" to handle non-initialized
> > > virtqueue case.
> > > 
> > > v2 -> v3:
> > >   - update commit message for the 
> > > "vhost: recheck dev state in the vhost_migration_log routine" commit
> > >   - rename "started" field of the VhostUserBlk structure to
> > > "started_vu", so there will be no confustion with the VHOST started
> > > field
> > >   - update vhost-user-test.c to always initialize nq local variable
> > > (spotted by patchew)
> > > 
> > > v1 -> v2:
> > >   - add comments to connected/started fields in the header file
> > >   - move the "s->started" logic from the vhost_user_blk_disconnect
> > > routine to the vhost_user_blk_stop routine
> > > 
> > > Reference e-mail threads:
> > >   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
> > >   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html
> > > 
> > > If vhost-user daemon is used as a backend for the vhost device, then we
> > > should consider a possibility of disconnect at any moment. There was a 
> > > general
> > > question here: should we consider it as an error or okay state for the 
> > > vhost-user
> > > devices during migration process?
> > > I think the disconnect event for the vhost-user devices should not break 
> > > the
> > > migration process, because:
> > >   - the device will be in the stopped state, so it will not be changed
> > > during migration
> > >   - if reconnect will be made the migration log will be reinitialized as
> > > part of reconnect/init process:
> > > #0  vhost_log_global_start (listener=0x563989cf7be0)
> > > at hw/virtio/vhost.c:920
> > > #1  0x56398603d8bc in listener_add_address_space 
> > > (listener=0x563989cf7be0,
> > > as=0x563986ea4340 )
> > > at softmmu/memory.c:2664
> > > #2  0x56398603dd30 in memory_listener_register 
> > > (listener=0x563989cf7be0,
> > > as=0x563986ea4340 )
> > > at softmmu/memory.c:2740
> > > #3  0x563985fd6956 in vhost_dev_init (hdev=0x563989cf7bd8,
> > > opaque=0x563989cf7e30, backend_type=VHOST_BACKEND_TYPE_USER,
> > > busyloop_timeout=0)
> > > at hw/virtio/vhost.c:1385
> > > #4  0x563985f7d0b8 in vhost_user_blk_connect (dev=0x563989cf7990)
> > > at hw/block/vhost-user-blk.c:315
> > > #5  0x563985f7d3f6 in vhost_user_blk_event (opaque=0x563989cf7990,
> > > event=CHR_EVENT_OPENED)
> > > at hw/block/vhost-user-blk.c:379
> > > The first patch in the patchset fixes this issue by setting vhost device 
> > > to the
> > > stopped state in the disconnect handler and check it the 
> > > vhost_migration_log()
> > > routine before returning from the function.
> > > qtest framework was updated to test vhost-user-blk functionality. The
> > > vhost-user-blk/vhost-user-blk-tests/migrate_reconnect test was added to 
> > > reproduce
> > > the original issue found.
> > > 
> > > Dima Stepanov (7):
> > >   vhost: recheck dev state in the vhost_migration_log routine
> > >   vhost: check queue state in the vhost_dev_set_log routine
> > >   tests/qtest/vhost-user-test: prepare the tests for adding new dev
> > > class
> > >   tests/qtest/libqos/virtio-blk: add support for vhost-user-blk
> > >   tests/qtest/vhost-user-test: add support for the vhost-user-blk device
> > >   tests/qtest/vhost-user-test: add migrate_reconnect test
> > >   tests/qtest/vhost-user-test: enable the reconnect tests
> > > 
> > >  hw/block/vhost-user-blk.c  |  19 ++-
> > >  hw/virtio/vhost.c  |  39 -
> > >  include/hw/virtio/vhost-user-blk.h |  10 ++
> > >  tests/qtest/libqos/virtio-blk.c|  14 +-
> > >  tests/qtest/vhost-user-test.c   

Re: [PATCH v5 0/7] vhost-user-blk: fix the migration issue and enhance qtests

2020-09-27 Thread Dima Stepanov
On Thu, Sep 24, 2020 at 07:26:14AM -0400, Michael S. Tsirkin wrote:
> On Fri, Sep 11, 2020 at 11:39:42AM +0300, Dima Stepanov wrote:
> > v4 -> v5:
> >   - vhost: check queue state in the vhost_dev_set_log routine
> > tests/qtest/vhost-user-test: prepare the tests for adding new
> > dev class
> > tests/qtest/vhost-user-test: add support for the
> > vhost-user-blk device
> > tests/qtest/vhost-user-test: add migrate_reconnect test
> > Reviewed-by: Raphael Norwitz
> >   - Update qtest, by merging vhost-user-blk "if" case with the
> > virtio-blk case.
> 
> I dropped patches 3-7 since they were stalling on some systems.
> Pls work with Peter Maydell (cc'd) to figure it out.
Thanks!

Peter, can you share any details for the stalling errors with me?

> 
> 
> > v3 -> v4:
> >   - vhost: recheck dev state in the vhost_migration_log routine
> > Reviewed-by: Raphael Norwitz
> >   - vhost: check queue state in the vhost_dev_set_log routine
> > Use "continue" instead of "break" to handle non-initialized
> > virtqueue case.
> > 
> > v2 -> v3:
> >   - update commit message for the 
> > "vhost: recheck dev state in the vhost_migration_log routine" commit
> >   - rename "started" field of the VhostUserBlk structure to
> > "started_vu", so there will be no confustion with the VHOST started
> > field
> >   - update vhost-user-test.c to always initialize nq local variable
> > (spotted by patchew)
> > 
> > v1 -> v2:
> >   - add comments to connected/started fields in the header file
> >   - move the "s->started" logic from the vhost_user_blk_disconnect
> > routine to the vhost_user_blk_stop routine
> > 
> > Reference e-mail threads:
> >   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
> >   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html
> > 
> > If vhost-user daemon is used as a backend for the vhost device, then we
> > should consider a possibility of disconnect at any moment. There was a 
> > general
> > question here: should we consider it as an error or okay state for the 
> > vhost-user
> > devices during migration process?
> > I think the disconnect event for the vhost-user devices should not break the
> > migration process, because:
> >   - the device will be in the stopped state, so it will not be changed
> > during migration
> >   - if reconnect will be made the migration log will be reinitialized as
> > part of reconnect/init process:
> > #0  vhost_log_global_start (listener=0x563989cf7be0)
> > at hw/virtio/vhost.c:920
> > #1  0x56398603d8bc in listener_add_address_space 
> > (listener=0x563989cf7be0,
> > as=0x563986ea4340 )
> > at softmmu/memory.c:2664
> > #2  0x56398603dd30 in memory_listener_register 
> > (listener=0x563989cf7be0,
> > as=0x563986ea4340 )
> > at softmmu/memory.c:2740
> > #3  0x563985fd6956 in vhost_dev_init (hdev=0x563989cf7bd8,
> > opaque=0x563989cf7e30, backend_type=VHOST_BACKEND_TYPE_USER,
> > busyloop_timeout=0)
> > at hw/virtio/vhost.c:1385
> > #4  0x563985f7d0b8 in vhost_user_blk_connect (dev=0x563989cf7990)
> > at hw/block/vhost-user-blk.c:315
> > #5  0x563985f7d3f6 in vhost_user_blk_event (opaque=0x563989cf7990,
> > event=CHR_EVENT_OPENED)
> > at hw/block/vhost-user-blk.c:379
> > The first patch in the patchset fixes this issue by setting vhost device to 
> > the
> > stopped state in the disconnect handler and check it the 
> > vhost_migration_log()
> > routine before returning from the function.
> > qtest framework was updated to test vhost-user-blk functionality. The
> > vhost-user-blk/vhost-user-blk-tests/migrate_reconnect test was added to 
> > reproduce
> > the original issue found.
> > 
> > Dima Stepanov (7):
> >   vhost: recheck dev state in the vhost_migration_log routine
> >   vhost: check queue state in the vhost_dev_set_log routine
> >   tests/qtest/vhost-user-test: prepare the tests for adding new dev
> > class
> >   tests/qtest/libqos/virtio-blk: add support for vhost-user-blk
> >   tests/qtest/vhost-user-test: add support for the vhost-user-blk device
> >   tests/qtest/vhost-user-test: add migrate_reconnect test
> >   tests/qtest/vhost-user-test: enable the reconnect tests
> > 
> >  hw/block/vhost-user-blk.c  |  19 ++-
> >  hw/virtio/vhost.c  |  39 -
> >  include/hw/virtio/vhost-user-blk.h |  10 ++
> >  tests/qtest/libqos/virtio-blk.c|  14 +-
> >  tests/qtest/vhost-user-test.c  | 290 
> > +++--
> >  5 files changed, 322 insertions(+), 50 deletions(-)
> > 
> > -- 
> > 2.7.4
> 



Re: [PATCH v5 0/7] vhost-user-blk: fix the migration issue and enhance qtests

2020-09-24 Thread Michael S. Tsirkin
On Fri, Sep 11, 2020 at 11:39:42AM +0300, Dima Stepanov wrote:
> v4 -> v5:
>   - vhost: check queue state in the vhost_dev_set_log routine
> tests/qtest/vhost-user-test: prepare the tests for adding new
> dev class
> tests/qtest/vhost-user-test: add support for the
> vhost-user-blk device
> tests/qtest/vhost-user-test: add migrate_reconnect test
> Reviewed-by: Raphael Norwitz
>   - Update qtest, by merging vhost-user-blk "if" case with the
> virtio-blk case.

I dropped patches 3-7 since they were stalling on some systems.
Pls work with Peter Maydell (cc'd) to figure it out.


> v3 -> v4:
>   - vhost: recheck dev state in the vhost_migration_log routine
> Reviewed-by: Raphael Norwitz
>   - vhost: check queue state in the vhost_dev_set_log routine
> Use "continue" instead of "break" to handle non-initialized
> virtqueue case.
> 
> v2 -> v3:
>   - update commit message for the 
> "vhost: recheck dev state in the vhost_migration_log routine" commit
>   - rename "started" field of the VhostUserBlk structure to
> "started_vu", so there will be no confustion with the VHOST started
> field
>   - update vhost-user-test.c to always initialize nq local variable
> (spotted by patchew)
> 
> v1 -> v2:
>   - add comments to connected/started fields in the header file
>   - move the "s->started" logic from the vhost_user_blk_disconnect
> routine to the vhost_user_blk_stop routine
> 
> Reference e-mail threads:
>   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
>   - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html
> 
> If vhost-user daemon is used as a backend for the vhost device, then we
> should consider a possibility of disconnect at any moment. There was a general
> question here: should we consider it as an error or okay state for the 
> vhost-user
> devices during migration process?
> I think the disconnect event for the vhost-user devices should not break the
> migration process, because:
>   - the device will be in the stopped state, so it will not be changed
> during migration
>   - if reconnect will be made the migration log will be reinitialized as
> part of reconnect/init process:
> #0  vhost_log_global_start (listener=0x563989cf7be0)
> at hw/virtio/vhost.c:920
> #1  0x56398603d8bc in listener_add_address_space 
> (listener=0x563989cf7be0,
> as=0x563986ea4340 )
> at softmmu/memory.c:2664
> #2  0x56398603dd30 in memory_listener_register 
> (listener=0x563989cf7be0,
> as=0x563986ea4340 )
> at softmmu/memory.c:2740
> #3  0x563985fd6956 in vhost_dev_init (hdev=0x563989cf7bd8,
> opaque=0x563989cf7e30, backend_type=VHOST_BACKEND_TYPE_USER,
> busyloop_timeout=0)
> at hw/virtio/vhost.c:1385
> #4  0x563985f7d0b8 in vhost_user_blk_connect (dev=0x563989cf7990)
> at hw/block/vhost-user-blk.c:315
> #5  0x563985f7d3f6 in vhost_user_blk_event (opaque=0x563989cf7990,
> event=CHR_EVENT_OPENED)
> at hw/block/vhost-user-blk.c:379
> The first patch in the patchset fixes this issue by setting vhost device to 
> the
> stopped state in the disconnect handler and check it the vhost_migration_log()
> routine before returning from the function.
> qtest framework was updated to test vhost-user-blk functionality. The
> vhost-user-blk/vhost-user-blk-tests/migrate_reconnect test was added to 
> reproduce
> the original issue found.
> 
> Dima Stepanov (7):
>   vhost: recheck dev state in the vhost_migration_log routine
>   vhost: check queue state in the vhost_dev_set_log routine
>   tests/qtest/vhost-user-test: prepare the tests for adding new dev
> class
>   tests/qtest/libqos/virtio-blk: add support for vhost-user-blk
>   tests/qtest/vhost-user-test: add support for the vhost-user-blk device
>   tests/qtest/vhost-user-test: add migrate_reconnect test
>   tests/qtest/vhost-user-test: enable the reconnect tests
> 
>  hw/block/vhost-user-blk.c  |  19 ++-
>  hw/virtio/vhost.c  |  39 -
>  include/hw/virtio/vhost-user-blk.h |  10 ++
>  tests/qtest/libqos/virtio-blk.c|  14 +-
>  tests/qtest/vhost-user-test.c  | 290 
> +++--
>  5 files changed, 322 insertions(+), 50 deletions(-)
> 
> -- 
> 2.7.4




[PATCH v5 0/7] vhost-user-blk: fix the migration issue and enhance qtests

2020-09-11 Thread Dima Stepanov
v4 -> v5:
  - vhost: check queue state in the vhost_dev_set_log routine
tests/qtest/vhost-user-test: prepare the tests for adding new
dev class
tests/qtest/vhost-user-test: add support for the
vhost-user-blk device
tests/qtest/vhost-user-test: add migrate_reconnect test
Reviewed-by: Raphael Norwitz
  - Update qtest, by merging vhost-user-blk "if" case with the
virtio-blk case.

v3 -> v4:
  - vhost: recheck dev state in the vhost_migration_log routine
Reviewed-by: Raphael Norwitz
  - vhost: check queue state in the vhost_dev_set_log routine
Use "continue" instead of "break" to handle non-initialized
virtqueue case.

v2 -> v3:
  - update commit message for the 
"vhost: recheck dev state in the vhost_migration_log routine" commit
  - rename "started" field of the VhostUserBlk structure to
"started_vu", so there will be no confustion with the VHOST started
field
  - update vhost-user-test.c to always initialize nq local variable
(spotted by patchew)

v1 -> v2:
  - add comments to connected/started fields in the header file
  - move the "s->started" logic from the vhost_user_blk_disconnect
routine to the vhost_user_blk_stop routine

Reference e-mail threads:
  - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg01509.html
  - https://lists.gnu.org/archive/html/qemu-devel/2020-05/msg05241.html

If vhost-user daemon is used as a backend for the vhost device, then we
should consider a possibility of disconnect at any moment. There was a general
question here: should we consider it as an error or okay state for the 
vhost-user
devices during migration process?
I think the disconnect event for the vhost-user devices should not break the
migration process, because:
  - the device will be in the stopped state, so it will not be changed
during migration
  - if reconnect will be made the migration log will be reinitialized as
part of reconnect/init process:
#0  vhost_log_global_start (listener=0x563989cf7be0)
at hw/virtio/vhost.c:920
#1  0x56398603d8bc in listener_add_address_space 
(listener=0x563989cf7be0,
as=0x563986ea4340 )
at softmmu/memory.c:2664
#2  0x56398603dd30 in memory_listener_register (listener=0x563989cf7be0,
as=0x563986ea4340 )
at softmmu/memory.c:2740
#3  0x563985fd6956 in vhost_dev_init (hdev=0x563989cf7bd8,
opaque=0x563989cf7e30, backend_type=VHOST_BACKEND_TYPE_USER,
busyloop_timeout=0)
at hw/virtio/vhost.c:1385
#4  0x563985f7d0b8 in vhost_user_blk_connect (dev=0x563989cf7990)
at hw/block/vhost-user-blk.c:315
#5  0x563985f7d3f6 in vhost_user_blk_event (opaque=0x563989cf7990,
event=CHR_EVENT_OPENED)
at hw/block/vhost-user-blk.c:379
The first patch in the patchset fixes this issue by setting vhost device to the
stopped state in the disconnect handler and check it the vhost_migration_log()
routine before returning from the function.
qtest framework was updated to test vhost-user-blk functionality. The
vhost-user-blk/vhost-user-blk-tests/migrate_reconnect test was added to 
reproduce
the original issue found.

Dima Stepanov (7):
  vhost: recheck dev state in the vhost_migration_log routine
  vhost: check queue state in the vhost_dev_set_log routine
  tests/qtest/vhost-user-test: prepare the tests for adding new dev
class
  tests/qtest/libqos/virtio-blk: add support for vhost-user-blk
  tests/qtest/vhost-user-test: add support for the vhost-user-blk device
  tests/qtest/vhost-user-test: add migrate_reconnect test
  tests/qtest/vhost-user-test: enable the reconnect tests

 hw/block/vhost-user-blk.c  |  19 ++-
 hw/virtio/vhost.c  |  39 -
 include/hw/virtio/vhost-user-blk.h |  10 ++
 tests/qtest/libqos/virtio-blk.c|  14 +-
 tests/qtest/vhost-user-test.c  | 290 +++--
 5 files changed, 322 insertions(+), 50 deletions(-)

-- 
2.7.4