Re: [Virtio-fs] [PATCH v3 0/5] vhost-user: Back-end state migration

2023-09-27 Thread Stefan Hajnoczi
On Wed, Sep 27, 2023 at 10:32:14AM +0200, Hanna Czenczek wrote:
> On 26.09.23 22:10, Stefan Hajnoczi wrote:
> > Hi Hanna,
> > I was thinking about how this could work without SUSPEND/RESUME. What
> > do you think of the following?
> > 
> > 1. The front-end sends VHOST_USER_RESET_DEVICE (or
> > VHOST_USER_RESET_OWNER, when necessary) when the guest driver resets
> > the device but not on vhost_dev_start()/vhost_dev_stop().
> 
> This is half the work of SUSPEND/RESUME.  It isn’t easy to do.

I sent a patch series to bring VHOST_USER_RESET_DEVICE to all vhost-user
devices:
https://lore.kernel.org/qemu-devel/20230927192737.528280-1-stefa...@redhat.com/T/#t

> 
> > 2. Suspend the device when all virtqueues are stopped via
> > VHOST_USER_GET_VRING_BASE. Resume the device after at least one
> > virtqueue is started and enabled.
> > 3. Ignore VHOST_USER_SET_STATUS.
> > 
> > Reset would work. The device would suspend and resume without losing
> > state. Existing vhost-user backends already behave like this in
> > practice (they often don't implement RESET_DEVICE).
> 
> I don’t understand the point, though.  Today, reset in practice is a no-op
> anyway, precisely because we only send SET_STATUS 0, don’t fall back to
> RESET_OWNER/RESET_DEVICE, and no back-end implements SET_STATUS 0 as a
> reset.  By sending RESET_* in case of a guest-initiated reset and nothing in
> case of stop/cont, we effectively don’t change anything about the latter
> (which is what SUSPEND/RESUME would be for), but only fix the former case. 
> While I agree that it’s wrong that we don’t really reset the back-end in
> case of a guest-initiated reset, this is the first time in this whole
> discussion that that part has been presented as a problem that needs fixing
> now.
> 
> So the proposal effectively changes nothing for the vhost_dev_stop()/start()
> case where we’d want to make use of SUSPEND/RESUME, but only for the case
> where we would not use it.

We discussed this on a call today. 2 & 3 are additions to the spec that
Hanna has agreed to work on.

Stefan


signature.asc
Description: PGP signature


Re: [Virtio-fs] [PATCH v3 0/5] vhost-user: Back-end state migration

2023-09-27 Thread Hanna Czenczek

On 26.09.23 22:10, Stefan Hajnoczi wrote:

Hi Hanna,
I was thinking about how this could work without SUSPEND/RESUME. What
do you think of the following?

1. The front-end sends VHOST_USER_RESET_DEVICE (or
VHOST_USER_RESET_OWNER, when necessary) when the guest driver resets
the device but not on vhost_dev_start()/vhost_dev_stop().


This is half the work of SUSPEND/RESUME.  It isn’t easy to do.


2. Suspend the device when all virtqueues are stopped via
VHOST_USER_GET_VRING_BASE. Resume the device after at least one
virtqueue is started and enabled.
3. Ignore VHOST_USER_SET_STATUS.

Reset would work. The device would suspend and resume without losing
state. Existing vhost-user backends already behave like this in
practice (they often don't implement RESET_DEVICE).


I don’t understand the point, though.  Today, reset in practice is a 
no-op anyway, precisely because we only send SET_STATUS 0, don’t fall 
back to
RESET_OWNER/RESET_DEVICE, and no back-end implements SET_STATUS 0 as a 
reset.  By sending RESET_* in case of a guest-initiated reset and 
nothing in case of stop/cont, we effectively don’t change anything about 
the latter (which is what SUSPEND/RESUME would be for), but only fix the 
former case.  While I agree that it’s wrong that we don’t really reset 
the back-end in case of a guest-initiated reset, this is the first time 
in this whole discussion that that part has been presented as a problem 
that needs fixing now.


So the proposal effectively changes nothing for the 
vhost_dev_stop()/start() case where we’d want to make use of 
SUSPEND/RESUME, but only for the case where we would not use it.


Hanna




Re: [Virtio-fs] [PATCH v3 0/5] vhost-user: Back-end state migration

2023-09-27 Thread Hanna Czenczek

On 26.09.23 21:20, Stefan Hajnoczi wrote:



On Tue, Sep 26, 2023, 09:33 Hanna Czenczek  wrote:

On 25.09.23 22:48, Stefan Hajnoczi wrote:
> On Fri, Sep 15, 2023 at 12:25:25PM +0200, Hanna Czenczek wrote:
>> RFC:
>>
https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg04263.html
>>
>> v1:
>>
https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg01575.html
>>
>> v2:
>>
https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg02604.html
>>
>> Hi,
>>
>> I’ve decided not to work on vhost-user SUSPEND/RESUME for now –
it is
>> not technically required for virtio-fs migration, which is the
actual
>> priority for me now.  While we do want to have SUSPEND/RESUME
at some
>> point, the only practically existing reason for it is to be able to
>> implement vhost-level resetting in virtiofsd, but that is not
related to
>> migration.
> QEMU sends VHOST_USER_SET_STATUS 0 in vhost_dev_stop(). Are you
assuming
> that virtiofs back-ends do not reset the device upon receiving this
> message?


Absolutely.  vhost_dev_stop() is not in the migration-specific
path, but
is called whenever the vCPUs are stopped, which indeed happens to be
part of migration, but is also used in other cases, like QMP
stop.  We
have identified that it is wrong that we reset the back-end just
because
the vCPUs are stopped (e.g. on migration), but it is what we do right
now when the VM is paused (e.g. through QMP stop).

Therefore, stateful back-ends cannot implement reset lest stop/cont
breaks the device.  I don’t think anybody really cares whether a
vhost-user back-end actually resets its internal state (if there
is any)
when the guest driver asks for a reset on the virtio level, as
long as
the guest driver is then able to initialize the device afterwards.


Some devices send configuration change notifications. For example, 
virtio-net speed and duplex changes.


Imagine a network boot ROM runs and the firmware resets the virtio-net 
device when transferring control to the loaded kernel. Before the 
kernel driver initializes the device again, the vhost-user-net 
back-end reports a speed or duplex change and sends a Configuration 
Change Notification to the guest. The guest receives a spurious 
interrupt because the vhost-user-net device wasn't actually reset.


I don’t see how this relates to my argument that no stateful back-end 
can implement a full reset because doing so would break stop/cont.


If vhost-user-net were stateful (which it isn’t, AFAIK), it could choose 
to implement a work-around such that it would stop sending notifications 
on reset, but not reset its internal state.  Then, when qemu restores 
vring state in vhost_dev_start(), it would resume sending 
notifications.  But again, I fail to see how this is not already an 
issue for stop/cont.


I'm concerned that ignoring reset matters (admittedly in corner cases) 
and think that stateful device functionality shouldn't be added to the 
vhost-user protocol without a solution for reset.


I disagree.  We have a stateful device already, whether we add 
functionality acknowledging this to the protocol or not.  The problem 
exists.  It is independent of migration.  If there’s a problem because 
of this with migration, there’s a problem with stop/cont, too, that must 
already have been worked around.


This patch series changes the vhost-user protocol, which is used by 
many different devices, not just virtiofs. The solution should work 
for vhost-user devices of any type and not be based on what we can get 
away with when running the current QEMU + virtiofsd.


My argument was generic.  Any existing stateful device implementation 
must implement reset in such a way that it won’t break stop/cont, i.e., 
it must not reset its internal state.




I do
think people care that stop/cont works, so it follows to me that no
stateful back-end will reset its internal state when receiving a
virtio/vhost reset.  If they do, stop/cont breaks, which is a
user-visible bug that needs to be addressed – either properly by
implementing SUSPEND/RESUME in both qemu and the affected
back-ends, or
by using a similar work-around to virtiofsd, which is to ignore reset
commands.


Properly, please.


You misunderstand me.  I’m not presenting the choice I have.  I’m 
presenting the choices existing implementations *have had until this 
point*.  *None* chose to do it properly.  I don’t know of stateful 
implementations besides virtiofsd, but virtiofsd chose to be content 
with not implementing reset and thus having things “just work”.


The work-arounds must exist already.




It’s hard for me to imagine that people don’t care that stop/cont
breaks
some vhost-user back-end, but suddenly would start to care that
migration doesn’t work – especially given that first of al

Re: [Virtio-fs] [PATCH v3 0/5] vhost-user: Back-end state migration

2023-09-26 Thread Stefan Hajnoczi
Hi Hanna,
I was thinking about how this could work without SUSPEND/RESUME. What
do you think of the following?

1. The front-end sends VHOST_USER_RESET_DEVICE (or
VHOST_USER_RESET_OWNER, when necessary) when the guest driver resets
the device but not on vhost_dev_start()/vhost_dev_stop().
2. Suspend the device when all virtqueues are stopped via
VHOST_USER_GET_VRING_BASE. Resume the device after at least one
virtqueue is started and enabled.
3. Ignore VHOST_USER_SET_STATUS.

Reset would work. The device would suspend and resume without losing
state. Existing vhost-user backends already behave like this in
practice (they often don't implement RESET_DEVICE).

It's close enough to what you're proposing that it doesn't require
much additional work, but I think it covers the cases.

Two concerns:
1. It's specific to vhost-user and diverges from vDPA.
2. VHOST_USER_SET_STATUS might be needed in the future even though
it's not useful today.

Stefan



Re: [Virtio-fs] [PATCH v3 0/5] vhost-user: Back-end state migration

2023-09-26 Thread Stefan Hajnoczi
On Tue, Sep 26, 2023, 09:33 Hanna Czenczek  wrote:

> On 25.09.23 22:48, Stefan Hajnoczi wrote:
> > On Fri, Sep 15, 2023 at 12:25:25PM +0200, Hanna Czenczek wrote:
> >> RFC:
> >> https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg04263.html
> >>
> >> v1:
> >> https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg01575.html
> >>
> >> v2:
> >> https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg02604.html
> >>
> >> Hi,
> >>
> >> I’ve decided not to work on vhost-user SUSPEND/RESUME for now – it is
> >> not technically required for virtio-fs migration, which is the actual
> >> priority for me now.  While we do want to have SUSPEND/RESUME at some
> >> point, the only practically existing reason for it is to be able to
> >> implement vhost-level resetting in virtiofsd, but that is not related to
> >> migration.
> > QEMU sends VHOST_USER_SET_STATUS 0 in vhost_dev_stop(). Are you assuming
> > that virtiofs back-ends do not reset the device upon receiving this
> > message?
>

> Absolutely.  vhost_dev_stop() is not in the migration-specific path, but
> is called whenever the vCPUs are stopped, which indeed happens to be
> part of migration, but is also used in other cases, like QMP stop.  We
> have identified that it is wrong that we reset the back-end just because
> the vCPUs are stopped (e.g. on migration), but it is what we do right
> now when the VM is paused (e.g. through QMP stop).
>
> Therefore, stateful back-ends cannot implement reset lest stop/cont
> breaks the device.  I don’t think anybody really cares whether a
> vhost-user back-end actually resets its internal state (if there is any)
> when the guest driver asks for a reset on the virtio level, as long as
> the guest driver is then able to initialize the device afterwards.


Some devices send configuration change notifications. For example,
virtio-net speed and duplex changes.

Imagine a network boot ROM runs and the firmware resets the virtio-net
device when transferring control to the loaded kernel. Before the kernel
driver initializes the device again, the vhost-user-net back-end reports a
speed or duplex change and sends a Configuration Change Notification to the
guest. The guest receives a spurious interrupt because the vhost-user-net
device wasn't actually reset.

I'm concerned that ignoring reset matters (admittedly in corner cases) and
think that stateful device functionality shouldn't be added to the
vhost-user protocol without a solution for reset. This patch series changes
the vhost-user protocol, which is used by many different devices, not just
virtiofs. The solution should work for vhost-user devices of any type and
not be based on what we can get away with when running the current QEMU +
virtiofsd.

I do
> think people care that stop/cont works, so it follows to me that no
> stateful back-end will reset its internal state when receiving a
> virtio/vhost reset.  If they do, stop/cont breaks, which is a
> user-visible bug that needs to be addressed – either properly by
> implementing SUSPEND/RESUME in both qemu and the affected back-ends, or
> by using a similar work-around to virtiofsd, which is to ignore reset
> commands.
>

Properly, please.


> It’s hard for me to imagine that people don’t care that stop/cont breaks
> some vhost-user back-end, but suddenly would start to care that
> migration doesn’t work – especially given that first of all someone will
> need to manually implement any migration support in that back-end even
> with this series, which means that really, the only back-end we are
> talking about here is our virtiofsd.  To this day I’m not even aware of
> any other back-end that has internal state.
>

Another one I can think of is vhost-user-gpu.

Why did you give up on implementing SUSPEND/RESUME?

Stefan


> Hanna
>
>
>


Re: [Virtio-fs] [PATCH v3 0/5] vhost-user: Back-end state migration

2023-09-26 Thread Hanna Czenczek

On 25.09.23 22:48, Stefan Hajnoczi wrote:

On Fri, Sep 15, 2023 at 12:25:25PM +0200, Hanna Czenczek wrote:

RFC:
https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg04263.html

v1:
https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg01575.html

v2:
https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg02604.html

Hi,

I’ve decided not to work on vhost-user SUSPEND/RESUME for now – it is
not technically required for virtio-fs migration, which is the actual
priority for me now.  While we do want to have SUSPEND/RESUME at some
point, the only practically existing reason for it is to be able to
implement vhost-level resetting in virtiofsd, but that is not related to
migration.

QEMU sends VHOST_USER_SET_STATUS 0 in vhost_dev_stop(). Are you assuming
that virtiofs back-ends do not reset the device upon receiving this
message?


Absolutely.  vhost_dev_stop() is not in the migration-specific path, but 
is called whenever the vCPUs are stopped, which indeed happens to be 
part of migration, but is also used in other cases, like QMP stop.  We 
have identified that it is wrong that we reset the back-end just because 
the vCPUs are stopped (e.g. on migration), but it is what we do right 
now when the VM is paused (e.g. through QMP stop).


Therefore, stateful back-ends cannot implement reset lest stop/cont 
breaks the device.  I don’t think anybody really cares whether a 
vhost-user back-end actually resets its internal state (if there is any) 
when the guest driver asks for a reset on the virtio level, as long as 
the guest driver is then able to initialize the device afterwards.  I do 
think people care that stop/cont works, so it follows to me that no 
stateful back-end will reset its internal state when receiving a 
virtio/vhost reset.  If they do, stop/cont breaks, which is a 
user-visible bug that needs to be addressed – either properly by 
implementing SUSPEND/RESUME in both qemu and the affected back-ends, or 
by using a similar work-around to virtiofsd, which is to ignore reset 
commands.


It’s hard for me to imagine that people don’t care that stop/cont breaks 
some vhost-user back-end, but suddenly would start to care that 
migration doesn’t work – especially given that first of all someone will 
need to manually implement any migration support in that back-end even 
with this series, which means that really, the only back-end we are 
talking about here is our virtiofsd.  To this day I’m not even aware of 
any other back-end that has internal state.


Hanna




Re: [PATCH v3 0/5] vhost-user: Back-end state migration

2023-09-25 Thread Stefan Hajnoczi
On Fri, Sep 15, 2023 at 12:25:25PM +0200, Hanna Czenczek wrote:
> RFC:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg04263.html
> 
> v1:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg01575.html
> 
> v2:
> https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg02604.html
> 
> Hi,
> 
> I’ve decided not to work on vhost-user SUSPEND/RESUME for now – it is
> not technically required for virtio-fs migration, which is the actual
> priority for me now.  While we do want to have SUSPEND/RESUME at some
> point, the only practically existing reason for it is to be able to
> implement vhost-level resetting in virtiofsd, but that is not related to
> migration.

QEMU sends VHOST_USER_SET_STATUS 0 in vhost_dev_stop(). Are you assuming
that virtiofs back-ends do not reset the device upon receiving this
message?

> So one of the changes in v3 is that it no longer depends on the
> vhost-user SUSPEND/RESUME series, and describes the migration protocol
> without the device being suspended at any point, but merely that the
> vrings are stopped.
> 
> Other changes include:
> 
> - Patch 1:
>   - Rephrased a lot
>   - Added a description for the VHOST_USER_SET_DEVICE_STATE_FD
> parameters
>   - Renamed VHOST_USER_PROTOCOL_F_MIGRATORY_STATE to
> VHOST_USER_PROTOCOL_F_DEVICE_STATE
>   - enum variants changed in value due to dropping the SUSPEND/RESUME
> dependency
> 
> - Patch 2:
>   - Pulled in, was a stand-alone patch before
>   - Dropped a sentence about ring state before feature negotiations, as
> the rings are not to be used during that period anyway
>   - Bit of rephrasing
> 
> - Patch 3:
>   - Renamed “migratory state” to “device state”
>   - enum variants changed in value due to dropping the SUSPEND/RESUME
> dependency
> 
> - Patch 4:
>   - Changed `f` to @f (referencing parameter “f”) in comments
>   - Use g_autofree for the transfer buffer
>   - Note SUSPEND state as a future feature, not currently existing
>   - Wrap read() and write() in RETRY_ON_EINTR()
> 
> - Patch 5:
>   - Renamed “migratory state” to “device state”
>   - (kept R-b still)
> 
> 
> Hanna Czenczek (5):
>   vhost-user.rst: Migrating back-end-internal state
>   vhost-user.rst: Clarify enabling/disabling vrings
>   vhost-user: Interface for migration state transfer
>   vhost: Add high-level state save/load functions
>   vhost-user-fs: Implement internal migration
> 
>  docs/interop/vhost-user.rst   | 188 ++-
>  include/hw/virtio/vhost-backend.h |  24 +++
>  include/hw/virtio/vhost.h | 114 ++
>  hw/virtio/vhost-user-fs.c | 101 -
>  hw/virtio/vhost-user.c| 148 ++
>  hw/virtio/vhost.c | 241 ++
>  6 files changed, 810 insertions(+), 6 deletions(-)
> 
> -- 
> 2.41.0
> 


signature.asc
Description: PGP signature


[PATCH v3 0/5] vhost-user: Back-end state migration

2023-09-15 Thread Hanna Czenczek
RFC:
https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg04263.html

v1:
https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg01575.html

v2:
https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg02604.html

Hi,

I’ve decided not to work on vhost-user SUSPEND/RESUME for now – it is
not technically required for virtio-fs migration, which is the actual
priority for me now.  While we do want to have SUSPEND/RESUME at some
point, the only practically existing reason for it is to be able to
implement vhost-level resetting in virtiofsd, but that is not related to
migration.

So one of the changes in v3 is that it no longer depends on the
vhost-user SUSPEND/RESUME series, and describes the migration protocol
without the device being suspended at any point, but merely that the
vrings are stopped.

Other changes include:

- Patch 1:
  - Rephrased a lot
  - Added a description for the VHOST_USER_SET_DEVICE_STATE_FD
parameters
  - Renamed VHOST_USER_PROTOCOL_F_MIGRATORY_STATE to
VHOST_USER_PROTOCOL_F_DEVICE_STATE
  - enum variants changed in value due to dropping the SUSPEND/RESUME
dependency

- Patch 2:
  - Pulled in, was a stand-alone patch before
  - Dropped a sentence about ring state before feature negotiations, as
the rings are not to be used during that period anyway
  - Bit of rephrasing

- Patch 3:
  - Renamed “migratory state” to “device state”
  - enum variants changed in value due to dropping the SUSPEND/RESUME
dependency

- Patch 4:
  - Changed `f` to @f (referencing parameter “f”) in comments
  - Use g_autofree for the transfer buffer
  - Note SUSPEND state as a future feature, not currently existing
  - Wrap read() and write() in RETRY_ON_EINTR()

- Patch 5:
  - Renamed “migratory state” to “device state”
  - (kept R-b still)


Hanna Czenczek (5):
  vhost-user.rst: Migrating back-end-internal state
  vhost-user.rst: Clarify enabling/disabling vrings
  vhost-user: Interface for migration state transfer
  vhost: Add high-level state save/load functions
  vhost-user-fs: Implement internal migration

 docs/interop/vhost-user.rst   | 188 ++-
 include/hw/virtio/vhost-backend.h |  24 +++
 include/hw/virtio/vhost.h | 114 ++
 hw/virtio/vhost-user-fs.c | 101 -
 hw/virtio/vhost-user.c| 148 ++
 hw/virtio/vhost.c | 241 ++
 6 files changed, 810 insertions(+), 6 deletions(-)

-- 
2.41.0