On Thu, May 13, 2021 at 11:43:25AM +0200, Philippe Mathieu-Daudé wrote:
> Cc'ing few developers
>
> On 5/11/21 3:33 PM, Boeuf, Sebastien wrote:
> > Hi All,
> >
> > In the context of vhost-user, I was wondering how a reconnection should
> > be handled from the VMM perspective?
> >
> > In particular, I'm looking at the OVS-DPDK use case using the client
> > mode (meaning QEMU acts as the server), and I'd like to understand what
> > QEMU does to handle this. Upon disconnection from the backend, does QEMU
> > reset the virtio device (meaning the guest is notified about it)? And
> > upon the new connection from the backend, does QEMU go through the whole
> > vhost-user initialization once again (feature acknowledgement, setup of
> > vrings, etc...), or does it simply assume the backend will have saved
> > all these information?

I started a virtio-fs-specific email thread about vhost-user
reconnection here after your IRC messages the other day:
https://listman.redhat.com/archives/virtio-fs/2021-May/msg00105.html

The VMM does not reset the device. In general, vhost-user reconnection
is transparent to the guest. While the device is disconnected, the
virtqueues are not processed.

Upon reconnection, the vhost-user protocol traffic is almost identical
to a fresh connection. The VMM negotiates features, sends memory
regions, etc.

The device backend needs to persist device-specific state. This is why
reconnection and crash recovery are device-specific (and to some extent
implementation-specific, because not every device may support it or
implement it in the same way).

Stateless devices are easiest to support. vhost-user-net and
vhost-user-blk are the only devices I'm aware of that support
reconnection today.

Expect to encounter bugs. Reconnection is underspecified and leaves a
lot to the vhost-user implementation. There might also be design flaws
around synchronous VIRTIO transport hardware registers, where either the
vCPU hangs because it needs a response from the disconnected device, or
the VMM ignores the disconnected device in order to avoid hanging the
vCPU thread but the state is out of sync upon reconnection.

It's hard to test all possible states, so I doubt it's bullet-proof. It
probably works best for vhost-user-net and less well for other device
types.

Stefan
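
As a rough illustration of the flow described above, here is a minimal
Python sketch of a VMM-side front end that replays the setup messages on
every (re)connection and leaves requests queued while the backend is
away. FakeVmm and FakeBackend are hypothetical names, not QEMU code. The
message names are borrowed from the vhost-user protocol, but the
sequence is simplified and is not the exact order any real VMM uses.
Handling of requests that were in flight when the backend disappeared is
device-specific and is not modelled here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeBackend:
    """Hypothetical stand-in for a vhost-user device backend."""
    received: List[str] = field(default_factory=list)

    def handle(self, message: str) -> None:
        # Record every message so the traffic can be inspected afterwards.
        self.received.append(message)


@dataclass
class FakeVmm:
    """Hypothetical VMM-side front end (illustration only)."""
    num_queues: int = 2
    backend: Optional[FakeBackend] = None
    pending: List[str] = field(default_factory=list)

    def setup_messages(self) -> List[str]:
        # Simplified setup sequence: features, memory regions, then vrings.
        msgs = ["GET_FEATURES", "SET_FEATURES", "SET_OWNER", "SET_MEM_TABLE"]
        for q in range(self.num_queues):
            msgs += [
                f"SET_VRING_NUM[{q}]",
                f"SET_VRING_ADDR[{q}]",
                f"SET_VRING_BASE[{q}]",
                f"SET_VRING_KICK[{q}]",
                f"SET_VRING_CALL[{q}]",
            ]
        return msgs

    def on_connect(self, backend: FakeBackend) -> None:
        # A reconnection looks almost like a fresh connection: replay setup.
        self.backend = backend
        for msg in self.setup_messages():
            backend.handle(msg)
        # Virtqueue processing resumes once the backend is configured again.
        self.process()

    def on_disconnect(self) -> None:
        # No device reset here: the guest is not told about the disconnect.
        self.backend = None

    def guest_submit(self, request: str) -> int:
        # The guest keeps queueing requests whether or not a backend exists.
        self.pending.append(request)
        return self.process()

    def process(self) -> int:
        # While disconnected, the virtqueues are simply not processed.
        if self.backend is None:
            return 0
        handled = len(self.pending)
        for request in self.pending:
            self.backend.handle(f"REQ {request}")
        self.pending.clear()
        return handled
```

Connecting one FakeBackend, dropping it, submitting a request, and then
connecting a second FakeBackend shows the second backend receiving the
full setup sequence again, followed by the request that was queued while
no backend was attached.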