On 25.09.23 22:48, Stefan Hajnoczi wrote:
On Fri, Sep 15, 2023 at 12:25:25PM +0200, Hanna Czenczek wrote:
RFC:
https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg04263.html
v1:
https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg01575.html
v2:
https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg02604.html
Hi,
I’ve decided not to work on vhost-user SUSPEND/RESUME for now – it is
not technically required for virtio-fs migration, which is the actual
priority for me now. While we do want to have SUSPEND/RESUME at some
point, the only practically existing reason for it is to be able to
implement vhost-level resetting in virtiofsd, but that is not related to
migration.
QEMU sends VHOST_USER_SET_STATUS 0 in vhost_dev_stop(). Are you assuming
that virtiofs back-ends do not reset the device upon receiving this
message?
Absolutely. vhost_dev_stop() is not in the migration-specific path, but
is called whenever the vCPUs are stopped, which indeed happens to be
part of migration, but is also used in other cases, like QMP stop. We
have identified that it is wrong that we reset the back-end just because
the vCPUs are stopped (e.g. on migration), but it is what we do right
now when the VM is paused (e.g. through QMP stop).
Therefore, stateful back-ends cannot implement reset lest stop/cont
breaks the device. I don’t think anybody really cares whether a
vhost-user back-end actually resets its internal state (if there is any)
when the guest driver asks for a reset on the virtio level, as long as
the guest driver is then able to initialize the device afterwards. I do
think people care that stop/cont works, so it follows to me that no
stateful back-end will reset its internal state when receiving a
virtio/vhost reset. If they do, stop/cont breaks, which is a
user-visible bug that needs to be addressed – either properly by
implementing SUSPEND/RESUME in both qemu and the affected back-ends, or
by using a similar work-around to virtiofsd, which is to ignore reset
commands.
It’s hard for me to imagine that people don’t care that stop/cont breaks
some vhost-user back-end, but suddenly would start to care that
migration doesn’t work – especially given that first of all someone will
need to manually implement any migration support in that back-end even
with this series, which means that really, the only back-end we are
talking about here is our virtiofsd. To this day I’m not even aware of
any other back-end that has internal state.
Hanna