Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-20 Thread Zhu, Lingshan




On 9/20/2023 3:51 PM, Parav Pandit wrote:

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 1:17 PM

This is not live or device migration. This is restoring the device context
initiated by the driver owning the device.

Restoring the device context should be done by the hypervisor before setting
DRIVER_OK and waking up the guest, so it is not a concern here and is out of
the spec's scope.

The framework is generic for PCI devices, hence there may not be any
hypervisor at all. Hence the restore operation is triggered by setting
DRIVER_OK when the device was previously suspended.

Since we already agreed in a previous email that the driver re-reads until
the device sets DRIVER_OK, it's fine to me. It covers the above restore
condition.

OK


Thanks.






Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-20 Thread Zhu, Lingshan




On 9/20/2023 3:35 PM, Parav Pandit wrote:

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 1:00 PM

On 9/20/2023 3:24 PM, Chen, Jiqian wrote:

Hi Lingshan,
It seems you replied to the wrong email thread. They are not related to my
patch.

These are replies to Parav's comments.
@Parav, if you want to discuss more about live migration, please reply in my
thread; let's not flood this one.

You made the point that "this is not live migration".
I am not discussing live migration in this thread.

I replied to the point that device restoration from suspend, for physical and
hypervisor-based devices, is not 40 nsec worth of work that can be done by
just a register write.
This is not live or device migration. This is restoring the device context
initiated by the driver owning the device.

Restoring the device context should be done by the hypervisor before setting
DRIVER_OK and waking up the guest, so it is not a concern here and is out of
the spec's scope.





Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-20 Thread Zhu, Lingshan




On 9/20/2023 3:32 PM, Parav Pandit wrote:



From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 12:58 PM

On 9/20/2023 3:10 PM, Parav Pandit wrote:

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 12:37 PM

The problem to overcome in [1] is that the resume operation needs to be
synchronous, as it involves a large part of the context to resume, and hence
just asynchronously setting DRIVER_OK is not enough.

The SW must verify back that the device has resumed operation and is ready
to answer requests.

this is not live migration; all device status and other information still
stays in the device. There is no need to "resume" the context, the device
just resumes running.


I am aware that it is not live migration. :)

"Just resuming" involves a lot of device setup work. The device
implementation does not know how long a device has been suspended.
So for example, if a VM is suspended for 6 hours, the device context could
be saved on a slow disk.
Hence, when the resume is done, the device needs to set things up again, and
the driver has to verify that before accessing more from the device.

The restore procedure should be performed by the hypervisor and completed
before setting DRIVER_OK and waking up the guest.

What is the signal to trigger the restore? What is the trigger in a physical
device when there is no hypervisor?

In my view, setting DRIVER_OK is the signal, regardless of hypervisor or
physical device.
Hence the re-read is a must.

Yes, as I said below, the driver should verify by re-reading.



And the hypervisor/driver needs to check the device status by re-reading.

Like resuming from a failed LM.

This is a slightly different flow than setting DRIVER_OK for the first-time
device initialization sequence, as it does not involve large restoration.

So, to merge the two ideas, instead of doing DRIVER_OK to resume, the driver
should clear the SUSPEND bit and verify that it is out of SUSPEND.

Because the driver is still in _OK_ state, driving the device while flipping
the SUSPEND bit.

Please read the spec, it says:
The driver MUST NOT clear a device status bit

Yes, this is why either DRIVER_OK validation by the driver is needed or
Jiqian's synchronous new register.

so re-read

Yes, re-read until set. Thanks.
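For illustration, a minimal driver-side sketch of the re-read flow agreed
above; the status accessors simulate a register and are hypothetical
stand-ins for the transport's common configuration accessors, not code from
any posted patch:

    #include <stdint.h>
    #include <stdio.h>

    #define VIRTIO_CONFIG_S_DRIVER_OK 4   /* device status bit from the virtio spec */

    /* Simulated status register for this sketch; a real driver would
     * access the device's common configuration space instead. */
    static volatile uint8_t device_status;

    static uint8_t vp_get_status(void) { return device_status; }
    static void vp_set_status(uint8_t s) { device_status = s; }

    /* Resume a previously suspended device: write DRIVER_OK, then
     * re-read until the device reports it, since restoring the device
     * context may take the device a long time. */
    static void vp_resume_device(void)
    {
        vp_set_status(vp_get_status() | VIRTIO_CONFIG_S_DRIVER_OK);
        while (!(vp_get_status() & VIRTIO_CONFIG_S_DRIVER_OK)) {
            /* poll; a real driver would sleep or time out here */
        }
    }

    int main(void)
    {
        vp_resume_device();
        printf("resumed, status=0x%x\n", device_status);
        return 0;
    }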






Re: [virtio-comment] Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-20 Thread Zhu, Lingshan




On 9/20/2023 3:17 PM, Chen, Jiqian wrote:

Hi Lingshan,

On 2023/9/20 14:58, Zhu, Lingshan wrote:


On 9/20/2023 2:33 PM, Chen, Jiqian wrote:

Hi Lingshan,

On 2023/9/20 13:59, Zhu, Lingshan wrote:

On 9/19/2023 8:31 PM, Michael S. Tsirkin wrote:

On Tue, Sep 19, 2023 at 07:42:42PM +0800, Jiqian Chen wrote:

When a guest VM does S3, QEMU will reset and clear some things of virtio
devices, but the guest can't be aware of that, which may cause some
problems. For example, QEMU calls virtio_reset->virtio_gpu_gl_reset when the
guest resumes; that function destroys the render resources of virtio-gpu.
As a result, after the guest resumes, the display can't come back and we
only see a black screen. Since the guest can't re-create all the resources,
we need to let QEMU not destroy them during S3.

For the above purpose, we need a mechanism that allows the guest and QEMU to
negotiate their reset behavior. So this patch adds a new parameter named
freeze_mode to struct virtio_pci_common_cfg. When the guest suspends, it can
write freeze_mode as FREEZE_S3, and then virtio devices can change their
reset behavior on the QEMU side according to freeze_mode. What's more,
freeze_mode can be used by all virtio devices to affect the behavior of
QEMU, not just the virtio-gpu device.

Hi Jiqian,

Have you seen this series: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state
https://lore.kernel.org/all/3f4cbf84-010c-cffa-0b70-33c449b55...@intel.com/T/

We introduced a SUSPEND bit in the device status: when VIRTIO_F_SUSPEND is
negotiated, the driver can set SUSPEND in the device status to suspend the
device.

When SUSPENDed, the device should pause its operations and preserve its
configuration in its configuration space.

The driver re-writes DRIVER_OK to clear SUSPEND, so the device resumes
running.

This was originally to serve live migration, but I think it can also meet
your needs.
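For context, a minimal device-side sketch of the suspend/resume semantics
described above; the SUSPEND bit value below is an assumption for
illustration only, the real value is whatever the series defines:

    #include <stdint.h>

    #define VIRTIO_CONFIG_S_DRIVER_OK 4    /* from the virtio spec */
    #define VIRTIO_CONFIG_S_SUSPEND   16   /* assumed value, illustration only */

    static uint8_t device_status;

    /* Device-side model of a driver status write under the proposal:
     * setting SUSPEND pauses the device and preserves its configuration;
     * re-writing DRIVER_OK clears SUSPEND so the device resumes running
     * (the driver itself must never clear a status bit). */
    static void device_handle_status_write(uint8_t new_status)
    {
        if (new_status & VIRTIO_CONFIG_S_SUSPEND) {
            device_status |= VIRTIO_CONFIG_S_SUSPEND;   /* pause, keep config */
        } else if (new_status & VIRTIO_CONFIG_S_DRIVER_OK) {
            /* resume: the device, not the driver, drops SUSPEND */
            device_status = new_status & (uint8_t)~VIRTIO_CONFIG_S_SUSPEND;
        }
    }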

I noticed your series, but I am not sure it also meets my needs.
If the driver writes 0 to reset the device, can the SUSPEND bit be cleared?
(pci_pm_resume->virtio_pci_restore->virtio_device_restore->virtio_reset_device)

if the driver writes 0, it resets all virtio functionalities. So SUSPEND is
cleared.

Then your patches do not meet my needs. In my scenario, the SUSPEND bit
needs to be kept until the resume process is complete.
Because in my virtio-gpu scenario, when the guest resumes, it calls
virtio_reset_device to clear all device status bits, which then resets
virtio-gpu in QEMU and destroys the render resources. I don't want the
resources destroyed during the resume process. So I add freeze_mode to tell
QEMU that the guest is doing S3 and the resources need to be kept.

When a guest is set to S3, the hypervisor suspends the guest to RAM, and the
passthrough devices are in a low-power state.
I am not sure the device can keep its status/context/data; it may need to
recover from RAM anyway.


So I suggest not resetting the device; this needs some code changes in QEMU:
when set to S3, SUSPEND the device, then suspend to RAM;
when resuming from S3, restore from RAM and set DRIVER_OK to resume running.



device reset can also be used to recover the device from fatal errors, so it
should reset everything in virtio.

If SUSPEND is cleared, then during the reset process in QEMU I can't judge
whether the reset request is from the guest restore process or not, and then
I can't change the reset behavior.

I think when entering S3, the hypervisor/driver should set SUSPEND on the
device. And when resuming from S3, the hypervisor/driver should re-write
DRIVER_OK to clear SUSPEND; then the device resumes running.

Can you send me your patch links on the kernel and QEMU side? I will take a
deep look.

There are no patches for qemu/kernel yet, spec first.

Thanks,
Zhu Lingshan

Signed-off-by: Jiqian Chen 
---
    transport-pci.tex | 7 +++
    1 file changed, 7 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index a5c6719..2543536 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
    le64 queue_desc;    /* read-write */
    le64 queue_driver;  /* read-write */
    le64 queue_device;  /* read-write */
+    le16 freeze_mode;   /* read-write */
    le16 queue_notif_config_data;   /* read-only for driver */
    le16 queue_reset;   /* read-write */


we can't add fields in the middle of the structure like this -
the offsets of queue_notif_config_data and queue_reset change.

@@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
    \item[\field{queue_device}]
    The driver writes the physical address of Device Area here.  See 
section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
    +\item[\field{freeze_mode}]
+    The driver writes this to set the freeze mode of virtio pci.

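Regarding Michael's point above about field placement: the layout-compatible
way to extend the structure is to append new fields at the end, roughly as
in this sketch (illustrative only, not the revised patch):

    #include <stdint.h>

    typedef uint16_t le16;
    typedef uint64_t le64;

    /* Tail of struct virtio_pci_common_cfg: a new field must be appended
     * after the existing ones so that queue_notif_config_data and
     * queue_reset keep their offsets for existing drivers. */
    struct virtio_pci_common_cfg_tail {
        le64 queue_desc;               /* read-write */
        le64 queue_driver;             /* read-write */
        le64 queue_device;             /* read-write */
        le16 queue_notif_config_data;  /* read-only for driver */
        le16 queue_reset;              /* read-write */
        le16 freeze_mode;              /* read-write; appended at the end */
    };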
Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-20 Thread Zhu, Lingshan




On 9/20/2023 3:24 PM, Chen, Jiqian wrote:

Hi Lingshan,
It seems you replied to the wrong email thread. They are not related to my patch.

These are replies to Parav's comments.
@Parav, if you want to discuss more about live migration, please reply in my
thread; let's not flood this one.


On 2023/9/20 15:06, Zhu, Lingshan wrote:


On 9/20/2023 2:58 PM, Parav Pandit wrote:

From: Chen, Jiqian 
Sent: Wednesday, September 20, 2023 12:03 PM
If the driver writes 0 to reset the device, can the SUSPEND bit be cleared?

It must, as the reset operation resets everything else, and so the suspend too.


(pci_pm_resume->virtio_pci_restore->virtio_device_restore->virtio_reset_device)

If SUSPEND is cleared, then during the reset process in QEMU I can't judge
whether the reset request is from the guest restore process or not, and then
I can't change the reset behavior.

Reset should not be influenced by suspend.
Suspend should do the work of suspend, and reset should do the reset.

The problem to overcome in [1] is that the resume operation needs to be
synchronous, as it involves a large part of the context to resume, and hence
just asynchronously setting DRIVER_OK is not enough.
The SW must verify back that the device has resumed operation and is ready
to answer requests.

this is not live migration; all device status and other information still
stays in the device. There is no need to "resume" the context, the device
just resumes running.

Like resuming from a failed LM.

This is a slightly different flow than setting DRIVER_OK for the first-time
device initialization sequence, as it does not involve large restoration.

So, to merge the two ideas, instead of doing DRIVER_OK to resume, the driver
should clear the SUSPEND bit and verify that it is out of SUSPEND.

Because the driver is still in _OK_ state, driving the device while flipping
the SUSPEND bit.

Please read the spec, it says:
The driver MUST NOT clear a device status bit







Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-20 Thread Zhu, Lingshan




On 9/20/2023 3:10 PM, Parav Pandit wrote:

From: Zhu, Lingshan 
Sent: Wednesday, September 20, 2023 12:37 PM

The problem to overcome in [1] is that the resume operation needs to be
synchronous, as it involves a large part of the context to resume, and hence
just asynchronously setting DRIVER_OK is not enough.

The SW must verify back that the device has resumed operation and is ready
to answer requests.

this is not live migration; all device status and other information still
stays in the device. There is no need to "resume" the context, the device
just resumes running.

I am aware that it is not live migration. :)

"Just resuming" involves a lot of device setup work. The device
implementation does not know how long a device has been suspended.
So for example, if a VM is suspended for 6 hours, the device context could
be saved on a slow disk.
Hence, when the resume is done, the device needs to set things up again, and
the driver has to verify that before accessing more from the device.

The restore procedure should be performed by the hypervisor and completed
before setting DRIVER_OK and waking up the guest.

And the hypervisor/driver needs to check the device status by re-reading.

Like resuming from a failed LM.

This is a slightly different flow than setting DRIVER_OK for the first-time
device initialization sequence, as it does not involve large restoration.

So, to merge the two ideas, instead of doing DRIVER_OK to resume, the driver
should clear the SUSPEND bit and verify that it is out of SUSPEND.

Because the driver is still in _OK_ state, driving the device while flipping
the SUSPEND bit.

Please read the spec, it says:
The driver MUST NOT clear a device status bit

Yes, this is why either DRIVER_OK validation by the driver is needed or
Jiqian's synchronous new register.

so re-read







Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-20 Thread Zhu, Lingshan




On 9/20/2023 2:58 PM, Parav Pandit wrote:

From: Chen, Jiqian 
Sent: Wednesday, September 20, 2023 12:03 PM
If the driver writes 0 to reset the device, can the SUSPEND bit be cleared?

It must, as the reset operation resets everything else, and so the suspend too.


(pci_pm_resume->virtio_pci_restore->virtio_device_restore->virtio_reset_device)

If SUSPEND is cleared, then during the reset process in QEMU I can't judge
whether the reset request is from the guest restore process or not, and then
I can't change the reset behavior.

Reset should not be influenced by suspend.
Suspend should do the work of suspend, and reset should do the reset.

The problem to overcome in [1] is that the resume operation needs to be
synchronous, as it involves a large part of the context to resume, and hence
just asynchronously setting DRIVER_OK is not enough.
The SW must verify back that the device has resumed operation and is ready
to answer requests.

this is not live migration; all device status and other information still
stays in the device. There is no need to "resume" the context, the device
just resumes running.


Like resuming from a failed LM.

This is a slightly different flow than setting DRIVER_OK for the first-time
device initialization sequence, as it does not involve large restoration.

So, to merge the two ideas, instead of doing DRIVER_OK to resume, the driver
should clear the SUSPEND bit and verify that it is out of SUSPEND.

Because the driver is still in _OK_ state, driving the device while flipping
the SUSPEND bit.

Please read the spec, it says:
The driver MUST NOT clear a device status bit





Re: [virtio-dev] Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-20 Thread Zhu, Lingshan




On 9/20/2023 2:33 PM, Chen, Jiqian wrote:

Hi Lingshan,

On 2023/9/20 13:59, Zhu, Lingshan wrote:


On 9/19/2023 8:31 PM, Michael S. Tsirkin wrote:

On Tue, Sep 19, 2023 at 07:42:42PM +0800, Jiqian Chen wrote:

When a guest VM does S3, QEMU will reset and clear some things of virtio
devices, but the guest can't be aware of that, which may cause some
problems. For example, QEMU calls virtio_reset->virtio_gpu_gl_reset when the
guest resumes; that function destroys the render resources of virtio-gpu.
As a result, after the guest resumes, the display can't come back and we
only see a black screen. Since the guest can't re-create all the resources,
we need to let QEMU not destroy them during S3.

For the above purpose, we need a mechanism that allows the guest and QEMU to
negotiate their reset behavior. So this patch adds a new parameter named
freeze_mode to struct virtio_pci_common_cfg. When the guest suspends, it can
write freeze_mode as FREEZE_S3, and then virtio devices can change their
reset behavior on the QEMU side according to freeze_mode. What's more,
freeze_mode can be used by all virtio devices to affect the behavior of
QEMU, not just the virtio-gpu device.

Hi Jiqian,

Have you seen this series: [PATCH 0/5] virtio: introduce SUSPEND bit and vq 
state
https://lore.kernel.org/all/3f4cbf84-010c-cffa-0b70-33c449b55...@intel.com/T/

We introduced a SUSPEND bit in the device status: when VIRTIO_F_SUSPEND is
negotiated, the driver can set SUSPEND in the device status to suspend the
device.

When SUSPENDed, the device should pause its operations and preserve its
configuration in its configuration space.

The driver re-writes DRIVER_OK to clear SUSPEND, so the device resumes
running.

This was originally to serve live migration, but I think it can also meet
your needs.

I noticed your series, but I am not sure it also meets my needs.
If the driver writes 0 to reset the device, can the SUSPEND bit be cleared?
(pci_pm_resume->virtio_pci_restore->virtio_device_restore->virtio_reset_device)

if the driver writes 0, it resets all virtio functionalities. So SUSPEND is
cleared.
device reset can also be used to recover the device from fatal errors, so it
should reset everything in virtio.

If SUSPEND is cleared, then during the reset process in QEMU I can't judge
whether the reset request is from the guest restore process or not, and then
I can't change the reset behavior.

I think when entering S3, the hypervisor/driver should set SUSPEND on the
device. And when resuming from S3, the hypervisor/driver should re-write
DRIVER_OK to clear SUSPEND; then the device resumes running.

Can you send me your patch links on the kernel and QEMU side? I will take a
deep look.

There are no patches for qemu/kernel yet, spec first.



Thanks,
Zhu Lingshan

Signed-off-by: Jiqian Chen 
---
   transport-pci.tex | 7 +++
   1 file changed, 7 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index a5c6719..2543536 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
   le64 queue_desc;    /* read-write */
   le64 queue_driver;  /* read-write */
   le64 queue_device;  /* read-write */
+    le16 freeze_mode;   /* read-write */
   le16 queue_notif_config_data;   /* read-only for driver */
   le16 queue_reset;   /* read-write */


we can't add fields in the middle of the structure like this -
the offsets of queue_notif_config_data and queue_reset change.

@@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
   \item[\field{queue_device}]
   The driver writes the physical address of Device Area here.  See 
section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
   +\item[\field{freeze_mode}]
+    The driver writes this to set the freeze mode of virtio pci.
+    VIRTIO_PCI_FREEZE_MODE_UNFREEZE - virtio-pci is running;
+    VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 - guest vm is doing S3, and 
virtio-pci enters S3 suspension;
+    Other values are reserved for future use, like S4, etc.
+

we need to specify these values then.

we also need
- feature bit to detect support for S3
- conformance statements documenting behaviours under S3



   \item[\field{queue_notif_config_data}]
   This field exists only if VIRTIO_F_NOTIF_CONFIG_DATA has been 
negotiated.
   The driver will use this value when driver sends available buffer
--
2.34.1


Re: [virtio-comment] Re: [VIRTIO PCI PATCH v5 1/1] transport-pci: Add freeze_mode to virtio_pci_common_cfg

2023-09-19 Thread Zhu, Lingshan




On 9/19/2023 8:31 PM, Michael S. Tsirkin wrote:

On Tue, Sep 19, 2023 at 07:42:42PM +0800, Jiqian Chen wrote:

When a guest VM does S3, QEMU will reset and clear some things of virtio
devices, but the guest can't be aware of that, which may cause some
problems. For example, QEMU calls virtio_reset->virtio_gpu_gl_reset when the
guest resumes; that function destroys the render resources of virtio-gpu.
As a result, after the guest resumes, the display can't come back and we
only see a black screen. Since the guest can't re-create all the resources,
we need to let QEMU not destroy them during S3.

For the above purpose, we need a mechanism that allows the guest and QEMU to
negotiate their reset behavior. So this patch adds a new parameter named
freeze_mode to struct virtio_pci_common_cfg. When the guest suspends, it can
write freeze_mode as FREEZE_S3, and then virtio devices can change their
reset behavior on the QEMU side according to freeze_mode. What's more,
freeze_mode can be used by all virtio devices to affect the behavior of
QEMU, not just the virtio-gpu device.

Hi Jiqian,

Have you seen this series: [PATCH 0/5] virtio: introduce SUSPEND bit and 
vq state

https://lore.kernel.org/all/3f4cbf84-010c-cffa-0b70-33c449b55...@intel.com/T/

We introduced a SUSPEND bit in the device status: when VIRTIO_F_SUSPEND is
negotiated, the driver can set SUSPEND in the device status to suspend the
device.

When SUSPENDed, the device should pause its operations and preserve its
configuration in its configuration space.

The driver re-writes DRIVER_OK to clear SUSPEND, so the device resumes
running.

This was originally to serve live migration, but I think it can also meet
your needs.


Thanks,
Zhu Lingshan


Signed-off-by: Jiqian Chen 
---
  transport-pci.tex | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/transport-pci.tex b/transport-pci.tex
index a5c6719..2543536 100644
--- a/transport-pci.tex
+++ b/transport-pci.tex
@@ -319,6 +319,7 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
  le64 queue_desc;/* read-write */
  le64 queue_driver;  /* read-write */
  le64 queue_device;  /* read-write */
+le16 freeze_mode;   /* read-write */
  le16 queue_notif_config_data;   /* read-only for driver */
  le16 queue_reset;   /* read-write */


we can't add fields in the middle of the structure like this -
the offsets of queue_notif_config_data and queue_reset change.

@@ -393,6 +394,12 @@ \subsubsection{Common configuration structure 
layout}\label{sec:Virtio Transport
  \item[\field{queue_device}]
  The driver writes the physical address of Device Area here.  See 
section \ref{sec:Basic Facilities of a Virtio Device / Virtqueues}.
  
+\item[\field{freeze_mode}]

+The driver writes this to set the freeze mode of virtio pci.
+VIRTIO_PCI_FREEZE_MODE_UNFREEZE - virtio-pci is running;
+VIRTIO_PCI_FREEZE_MODE_FREEZE_S3 - guest vm is doing S3, and 
virtio-pci enters S3 suspension;
+Other values are reserved for future use, like S4, etc.
+

we need to specify these values then.

we also need
- feature bit to detect support for S3
- conformance statements documenting behaviours under S3



  \item[\field{queue_notif_config_data}]
  This field exists only if VIRTIO_F_NOTIF_CONFIG_DATA has been 
negotiated.
  The driver will use this value when driver sends available buffer
--
2.34.1








Re: [PATCH] vhost: disable VHOST_OPS_DEBUG by default

2023-07-18 Thread Zhu, Lingshan




On 7/17/2023 6:14 PM, Philippe Mathieu-Daudé wrote:

Hi,

On 17/7/23 19:44, Zhu Lingshan wrote:

This commit disables VHOST_OPS_DEBUG by default.
This information is usually only required in a development
environment.

Signed-off-by: Zhu Lingshan 
---
  hw/virtio/vhost.c | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 82394331bf..ec435a3079 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -29,7 +29,9 @@
  #include "trace.h"
    /* enabled until disconnected backend stabilizes */
-#define _VHOST_DEBUG 1
+
+/* uncomment macro _VHOST_DEBUG to enable VHOST_OPS_DEBUG */
+/* #define _VHOST_DEBUG 1 */


Since you are looking at this, it would be more useful to
convert VHOST_OPS_DEBUG() to trace events (see for example
commit 163b8663b8 and other "Convert DPRINTF() debug macro
to trace events" commits).

Thanks Phil, I will look into this


Regards,

Phil.
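For reference, the conversion Phil suggests follows QEMU's usual
trace-events pattern, roughly as below; the event name and format are
illustrative, not an actual patch:

    /* hw/virtio/trace-events would gain a declaration like (illustrative):
     *
     *   vhost_vq_restore_failed(int idx, int r) "vhost VQ %d ring restore failed: %d"
     *
     * and the call site in hw/virtio/vhost.c would then use the generated
     * trace function instead of the VHOST_OPS_DEBUG macro: */
    #include "trace.h"   /* QEMU's generated per-directory trace header */

    static void report_restore_failure(int idx, int r)
    {
        trace_vhost_vq_restore_failed(idx, r);   /* replaces VHOST_OPS_DEBUG */
    }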






[PATCH] vhost: disable VHOST_OPS_DEBUG by default

2023-07-17 Thread Zhu Lingshan
This commit disables VHOST_OPS_DEBUG by default.
This information is usually only required in a development
environment.

Signed-off-by: Zhu Lingshan 
---
 hw/virtio/vhost.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 82394331bf..ec435a3079 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -29,7 +29,9 @@
 #include "trace.h"
 
 /* enabled until disconnected backend stabilizes */
-#define _VHOST_DEBUG 1
+
+/* uncomment macro _VHOST_DEBUG to enable VHOST_OPS_DEBUG */
+/* #define _VHOST_DEBUG 1 */
 
 #ifdef _VHOST_DEBUG
 #define VHOST_OPS_DEBUG(retval, fmt, ...) \
@@ -1318,7 +1320,10 @@ static void vhost_virtqueue_error_notifier(EventNotifier 
*n)
 struct vhost_virtqueue *vq = container_of(n, struct vhost_virtqueue,
   error_notifier);
 struct vhost_dev *dev = vq->dev;
+
+#ifdef _VHOST_DEBUG
 int index = vq - dev->vqs;
+#endif
 
 if (event_notifier_test_and_clear(n) && dev->vdev) {
 VHOST_OPS_DEBUG(-EINVAL,  "vhost vring error in virtqueue %d",
-- 
2.39.3




Re: [PATCH V2] vhost_vdpa: no need to fetch vring base when poweroff

2023-07-12 Thread Zhu, Lingshan



On 7/12/2023 3:22 PM, Jason Wang wrote:



On Wed, Jul 12, 2023 at 2:54 PM Zhu, Lingshan  
wrote:




On 7/11/2023 3:34 PM, Jason Wang wrote:



On Tue, Jul 11, 2023 at 3:25 PM Eugenio Perez Martin
 wrote:

On Tue, Jul 11, 2023 at 9:05 AM Jason Wang
 wrote:
>
> On Tue, Jul 11, 2023 at 12:09 PM Zhu, Lingshan
 wrote:
> >
> >
> >
> > On 7/11/2023 10:50 AM, Jason Wang wrote:
> > > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan
 wrote:
> > >> In the poweroff routine, no need to fetch last
available index.
> > >>
> > > This is because there's no concept of shutdown in the
vhost layer, it
> > > only knows start and stop.
> > >
> > >> This commit also provides a better debug message in
the vhost
> > >> caller vhost_virtqueue_stop,
> > > A separate patch is better.
> > OK
> > >
> > >> because if vhost does not fetch
> > >> the last avail idx successfully, maybe the device does not
> > >> suspend, vhost will sync last avail idx to vring used
    idx as a
> > >> work around, not a failure.
> > > This only happens if we return a negative value?
> > Yes
> > >
> > >> Signed-off-by: Zhu Lingshan 
> > >> ---
> > >>   hw/virtio/vhost-vdpa.c | 10 ++
> > >>   hw/virtio/vhost.c      |  2 +-
> > >>   2 files changed, 11 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/hw/virtio/vhost-vdpa.c
b/hw/virtio/vhost-vdpa.c
> > >> index 3c575a9a6e..10b445f64e 100644
> > >> --- a/hw/virtio/vhost-vdpa.c
> > >> +++ b/hw/virtio/vhost-vdpa.c
> > >> @@ -26,6 +26,7 @@
> > >>   #include "cpu.h"
> > >>   #include "trace.h"
> > >>   #include "qapi/error.h"
> > >> +#include "sysemu/runstate.h"
> > >>
> > >>   /*
> > >>    * Return one past the end of the end of section. Be
careful with uint64_t
> > >> @@ -1391,6 +1392,15 @@ static int
vhost_vdpa_get_vring_base(struct vhost_dev *dev,
> > >>       struct vhost_vdpa *v = dev->opaque;
> > >>       int ret;
> > >>
> > >> +    if (runstate_check(RUN_STATE_SHUTDOWN)) {
> > >> +        /*
> > >> +         * Some devices do not support this call
properly,
> > >> +         * and we don't need to retrieve the indexes
> > >> +         * if it is shutting down
> > >> +         */
> > >> +        return 0;
> > > Checking runstate in the vhost code seems like a layer
violation.
> > >
> > > What happens without this patch?
> > vhost tries to fetch vring base,
> > vhost_vdpa needs suspend the device before retrieving
last_avail_idx.
> > However not all devices can support .suspend properly so
this call
> > may fail.
>
> I think this is where I'm lost. If the device doesn't support
> suspending, any reason we only try to fix the case of shutdown?
>
> Btw, the fail is intended:
>
>     if (!v->suspended) {
>         /*
>          * Cannot trust in value returned by device, let
vhost recover used
>          * idx from guest.
>          */
>         return -1;
>     }
>

The fail is intended, but to recover the last used idx,
either from
device or from the guest, is only useful in the case of
migration.


Note that we had userspace devices like VDUSE now, so it could be
useful in the case of a VDUSE daemon crash or reconnect.

This code block is for the vhost_vdpa backend, and I think vduse is
another code path.


I'm not sure I understand here; I meant vhost_vdpa + vduse. It works
similarly to vhost-user.

OK, so do you suggest we set the vring state to 0 and return 0 if we failed
to suspend the device, no matter whether it is shutdown or another case?


Returning a guest used idx may be a good idea, but as Eugenio pointed out,
that may duplicate the code.

Re: [PATCH V2] vhost_vdpa: no need to fetch vring base when poweroff

2023-07-11 Thread Zhu, Lingshan



On 7/11/2023 3:34 PM, Jason Wang wrote:



On Tue, Jul 11, 2023 at 3:25 PM Eugenio Perez Martin 
 wrote:


On Tue, Jul 11, 2023 at 9:05 AM Jason Wang 
wrote:
>
> On Tue, Jul 11, 2023 at 12:09 PM Zhu, Lingshan
 wrote:
> >
> >
> >
> > On 7/11/2023 10:50 AM, Jason Wang wrote:
> > > On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan
 wrote:
> > >> In the poweroff routine, no need to fetch last available index.
> > >>
> > > This is because there's no concept of shutdown in the vhost
layer, it
> > > only knows start and stop.
> > >
> > >> This commit also provides a better debug message in the vhost
> > >> caller vhost_virtqueue_stop,
> > > A separate patch is better.
> > OK
> > >
> > >> because if vhost does not fetch
> > >> the last avail idx successfully, maybe the device does not
> > >> suspend, vhost will sync last avail idx to vring used idx as a
> > >> work around, not a failure.
> > > This only happens if we return a negative value?
> > Yes
> > >
> > >> Signed-off-by: Zhu Lingshan 
> > >> ---
> > >>   hw/virtio/vhost-vdpa.c | 10 ++
> > >>   hw/virtio/vhost.c      |  2 +-
> > >>   2 files changed, 11 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > >> index 3c575a9a6e..10b445f64e 100644
> > >> --- a/hw/virtio/vhost-vdpa.c
> > >> +++ b/hw/virtio/vhost-vdpa.c
> > >> @@ -26,6 +26,7 @@
> > >>   #include "cpu.h"
> > >>   #include "trace.h"
> > >>   #include "qapi/error.h"
> > >> +#include "sysemu/runstate.h"
> > >>
> > >>   /*
> > >>    * Return one past the end of the end of section. Be
careful with uint64_t
> > >> @@ -1391,6 +1392,15 @@ static int
vhost_vdpa_get_vring_base(struct vhost_dev *dev,
> > >>       struct vhost_vdpa *v = dev->opaque;
> > >>       int ret;
> > >>
> > >> +    if (runstate_check(RUN_STATE_SHUTDOWN)) {
> > >> +        /*
> > >> +         * Some devices do not support this call properly,
> > >> +         * and we don't need to retrieve the indexes
> > >> +         * if it is shutting down
> > >> +         */
> > >> +        return 0;
> > > Checking runstate in the vhost code seems like a layer
violation.
> > >
> > > What happens without this patch?
> > vhost tries to fetch vring base,
> > vhost_vdpa needs suspend the device before retrieving
last_avail_idx.
> > However not all devices can support .suspend properly so this call
> > may fail.
>
> I think this is where I'm lost. If the device doesn't support
> suspending, any reason we only try to fix the case of shutdown?
>
> Btw, the fail is intended:
>
>     if (!v->suspended) {
>         /*
>          * Cannot trust in value returned by device, let vhost
recover used
>          * idx from guest.
>          */
>         return -1;
>     }
>

The fail is intended, but to recover the last used idx, either from
device or from the guest, is only useful in the case of migration.


Note that we had userspace devices like VDUSE now, so it could be 
useful in the case of a VDUSE daemon crash or reconnect.
This code block is for the vhost_vdpa backend, and I think vduse is another
code path.
Returning a guest used idx may be a good idea, but as Eugenio pointed out,
that may duplicate the code.



I think the main problem is the error message, actually. But I think
it has no use either to recover last_avail_idx or print a debug
message if we're not migrating. Another solution would be to recover
it from the guest at vhost_vdpa_get_vring_base, but I don't like the
duplication.

> And if we return to success here, will we go to set an uninitialized
> last avail idx?
>

It will be either the default value (is set to 0 at
__virtio_queue_reset) or the one received from a migration (at
virtio_load).


0 is even less optimal than the used index. Anyhow, VHOST_DEBUG should not
be enabled in production environments.
Returning 0 sounds like a queue reset; yes, we can reset a queue

Re: [PATCH V2] vhost_vdpa: no need to fetch vring base when poweroff

2023-07-10 Thread Zhu, Lingshan




On 7/11/2023 10:50 AM, Jason Wang wrote:

On Mon, Jul 10, 2023 at 4:53 PM Zhu Lingshan  wrote:

In the poweroff routine, no need to fetch last available index.


This is because there's no concept of shutdown in the vhost layer, it
only knows start and stop.


This commit also provides a better debug message in the vhost
caller vhost_virtqueue_stop,

A separate patch is better.

OK



because if vhost does not fetch
the last avail idx successfully, maybe the device did not
suspend; vhost will sync the last avail idx to the vring used idx as a
workaround, not a failure.

This only happens if we return a negative value?

Yes



Signed-off-by: Zhu Lingshan 
---
  hw/virtio/vhost-vdpa.c | 10 ++
  hw/virtio/vhost.c  |  2 +-
  2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3c575a9a6e..10b445f64e 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -26,6 +26,7 @@
  #include "cpu.h"
  #include "trace.h"
  #include "qapi/error.h"
+#include "sysemu/runstate.h"

  /*
   * Return one past the end of the end of section. Be careful with uint64_t
@@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
  struct vhost_vdpa *v = dev->opaque;
  int ret;

+if (runstate_check(RUN_STATE_SHUTDOWN)) {
+/*
+ * Some devices do not support this call properly,
+ * and we don't need to retrieve the indexes
+ * if it is shutting down
+ */
+return 0;

Checking runstate in the vhost code seems like a layer violation.

What happens without this patch?

vhost tries to fetch the vring base;
vhost_vdpa needs to suspend the device before retrieving last_avail_idx.
However, not all devices support .suspend properly, so this call
may fail. Then vhost will print an error showing that something failed.

The error msg is confusing: as stated in the commit log, restoring
last_avail_idx with the guest used idx is a workaround rather than a
failure. And there is no need to fetch last_avail_idx when powering off.

Thanks


Thanks


+}
+
  if (v->shadow_vqs_enabled) {
  ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
  return 0;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 82394331bf..7dd90cff3a 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,

  r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
  if (r < 0) {
-VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
+VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used idx for vhost VQ 
%u", idx);
  /* Connection to the backend is broken, so let's sync internal
   * last avail idx to the device used idx.
   */
--
2.39.3






[PATCH V2] vhost_vdpa: no need to fetch vring base when poweroff

2023-07-10 Thread Zhu Lingshan
In the poweroff routine, no need to fetch last available index.

This commit also provides a better debug message in the vhost
caller vhost_virtqueue_stop, because if vhost does not fetch
the last avail idx successfully, maybe the device did not
suspend; vhost will sync the last avail idx to the vring used idx as a
workaround, not a failure.

Signed-off-by: Zhu Lingshan 
---
 hw/virtio/vhost-vdpa.c | 10 ++
 hw/virtio/vhost.c  |  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3c575a9a6e..10b445f64e 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -26,6 +26,7 @@
 #include "cpu.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "sysemu/runstate.h"
 
 /*
  * Return one past the end of the end of section. Be careful with uint64_t
@@ -1391,6 +1392,15 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
 struct vhost_vdpa *v = dev->opaque;
 int ret;
 
+if (runstate_check(RUN_STATE_SHUTDOWN)) {
+/*
+ * Some devices do not support this call properly,
+ * and we don't need to retrieve the indexes
+ * if it is shutting down
+ */
+return 0;
+}
+
 if (v->shadow_vqs_enabled) {
 ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
 return 0;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 82394331bf..7dd90cff3a 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
 
 r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
 if (r < 0) {
-VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
+VHOST_OPS_DEBUG(r, "sync last avail idx to the guest used idx for 
vhost VQ %u", idx);
 /* Connection to the backend is broken, so let's sync internal
  * last avail idx to the device used idx.
  */
-- 
2.39.3




Re: [PATCH] vhost_vdpa: no need to fetch vring base when poweroff

2023-07-09 Thread Zhu, Lingshan




On 7/7/2023 7:02 PM, Eugenio Perez Martin wrote:

On Fri, Jul 7, 2023 at 12:18 PM Zhu Lingshan  wrote:

In the poweroff routine, no need to fetch last available index.

This commit also provides a better debug message in the vhost
caller vhost_virtqueue_stop, because if vhost does not fetch
the last avail idx successfully, maybe the device did not
suspend; vhost will sync the last avail idx to the vring used idx as a
workaround, not a failure.

Signed-off-by: Zhu Lingshan 

CCing MST.


---
  hw/virtio/vhost-vdpa.c | 4 
  hw/virtio/vhost.c  | 2 +-
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3c575a9a6e..f62952e1c7 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -26,6 +26,7 @@
  #include "cpu.h"
  #include "trace.h"
  #include "qapi/error.h"
+#include "sysemu/runstate.h"

  /*
   * Return one past the end of the end of section. Be careful with uint64_t
@@ -1391,6 +1392,9 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
  struct vhost_vdpa *v = dev->opaque;
  int ret;

+if (runstate_check(RUN_STATE_SHUTDOWN))
+return 0;
+

QEMU coding style mandates braces around the "if" body (CODING_STYLE.rst file).

Apart from that I think we should add a comment here. Something in the line of:

Some devices do not support the call properly, and we don't need to
retrieve the indexes if we're not migrating. Skip it in this case.

V2 will include these changes, thanks!



  if (v->shadow_vqs_enabled) {
  ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
  return 0;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 82394331bf..84712743e0 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,

  r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
  if (r < 0) {
-VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
+VHOST_OPS_DEBUG(r, "sync last avail idx to the used idx for vhost VQ 
%u", idx);

Guest's used idx? Also, maybe it is worth splitting this into a separate patch.
Yes, guest used idx is much clearer. I am not sure whether this is worth a
separate patch; this is only a comment.


Apart from the nitpicking, I think the general line of the patch is
the way to go :).

Thanks!


Thanks!


  /* Connection to the backend is broken, so let's sync internal
   * last avail idx to the device used idx.
   */
--
2.39.3






[PATCH] vhost_vdpa: no need to fetch vring base when poweroff

2023-07-07 Thread Zhu Lingshan
In the poweroff routine, no need to fetch last available index.

This commit also provides a better debug message in the vhost
caller vhost_virtqueue_stop, because if vhost does not fetch
the last avail idx successfully, maybe the device did not
suspend; vhost will sync the last avail idx to the vring used idx as a
workaround, not a failure.

Signed-off-by: Zhu Lingshan 
---
 hw/virtio/vhost-vdpa.c | 4 
 hw/virtio/vhost.c  | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 3c575a9a6e..f62952e1c7 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -26,6 +26,7 @@
 #include "cpu.h"
 #include "trace.h"
 #include "qapi/error.h"
+#include "sysemu/runstate.h"
 
 /*
  * Return one past the end of the end of section. Be careful with uint64_t
@@ -1391,6 +1392,9 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
 struct vhost_vdpa *v = dev->opaque;
 int ret;
 
+if (runstate_check(RUN_STATE_SHUTDOWN))
+return 0;
+
 if (v->shadow_vqs_enabled) {
 ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
 return 0;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 82394331bf..84712743e0 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -1262,7 +1262,7 @@ void vhost_virtqueue_stop(struct vhost_dev *dev,
 
 r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
 if (r < 0) {
-VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
+VHOST_OPS_DEBUG(r, "sync last avail idx to the used idx for vhost VQ 
%u", idx);
 /* Connection to the backend is broken, so let's sync internal
  * last avail idx to the device used idx.
  */
-- 
2.39.3




Re: [PATCH] vdpa: dont check vhost_vdpa->suspended when unsupported

2023-06-09 Thread Zhu, Lingshan




On 6/9/2023 11:13 AM, Jason Wang wrote:

On Wed, Jun 7, 2023 at 11:37 PM Eugenio Perez Martin
 wrote:

On Wed, Jun 7, 2023 at 11:09 AM Zhu Lingshan  wrote:

When reading the state of a virtqueue, vhost_vdpa needs
to check whether the device is suspended.

This commit verifies whether VHOST_BACKEND_F_SUSPEND is
negotiated when checking vhost_vdpa->suspended.


I'll add: Otherwise, qemu prints XXX error message.

On second thought, not returning an error prevents the caller
(vhost.c:vhost_virtqueue_stop) from recovering used idx from the
guest.

I'm not sure about the right fix for this, should we call
virtio_queue_restore_last_avail_idx here?

It should be the duty of the caller:

E.g in vhost_virtqueue_stop() we had:

=>  r = dev->vhost_ops->vhost_get_vring_base(dev, &state);
 if (r < 0) {
 VHOST_OPS_DEBUG(r, "vhost VQ %u ring restore failed: %d", idx, r);
 /* Connection to the backend is broken, so let's sync internal
  * last avail idx to the device used idx.
  */
virtio_queue_restore_last_avail_idx(vdev, idx);
 } else {
 virtio_queue_set_last_avail_idx(vdev, idx, state.num);

I agree calling virtio_queue_restore_last_avail_idx here in the caller is
better. However, I found this debug message confusing:

_get_vring_base() reads the queue state; it does not restore anything.
And if it fails to read the queue state, it invokes
virtio_queue_restore_last_avail_idx() to recover, so it still works and
handles the problem internally.

So how about we remove this debug message or change it to:
restore last_avail_idx of vhost vq N from vring used_idx.

Thanks


Thansk


Communicate that the backend
cannot suspend so vhost.c can print a better error message?

Thanks!


Signed-off-by: Zhu Lingshan 
---
  hw/virtio/vhost-vdpa.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index b3094e8a8b..ae176c06dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1397,7 +1397,7 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
  return 0;
  }

-if (!v->suspended) {
+if ((dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && 
(!v->suspended)) {
  /*
   * Cannot trust in value returned by device, let vhost recover used
   * idx from guest.
--
2.39.1






Re: [PATCH] vdpa: dont check vhost_vdpa->suspended when unsupported

2023-06-08 Thread Zhu, Lingshan




On 6/8/2023 6:59 PM, Eugenio Perez Martin wrote:

On Thu, Jun 8, 2023 at 11:34 AM Zhu, Lingshan  wrote:



On 6/8/2023 4:49 PM, Eugenio Perez Martin wrote:

On Thu, Jun 8, 2023 at 9:07 AM Zhu, Lingshan  wrote:


On 6/7/2023 9:51 PM, Eugenio Perez Martin wrote:

On Wed, Jun 7, 2023 at 11:09 AM Zhu Lingshan  wrote:

When reading the state of a virtqueue, vhost_vdpa needs
to check whether the device is suspended.

This commit verifies whether VHOST_BACKEND_F_SUSPEND is
negotiated when checking vhost_vdpa->suspended.


I'll add: Otherwise, qemu prints XXX error message.

On second thought, not returning an error prevents the caller
(vhost.c:vhost_virtqueue_stop) from recovering used idx from the
guest.

I'm not sure about the right fix for this, should we call
virtio_queue_restore_last_avail_idx here? Communicate that the backend
cannot suspend so vhost.c can print a better error message?

I agree it is better not to return an error.

Instead, whether the device does not support _F_SUSPEND or it failed to
suspend, I think vhost_vdpa should try to read the last_avail_idx from
the device anyway. We can print an error message here,
like: failed to suspend, the value may not be reliable.


We need to drop that value if it is not reliable. If the device keeps
using descriptors, used_idx may increase beyond avail_idx, so the
usual is to just restore whatever used_idx value the guest had at
stop.

This interface is used to read the last_avail_idx; the ideal case
is to fetch it after the device has been suspended.

If the device is not suspended, the value may be unreliable
because the device may keep consuming descriptors.
However, at that moment the guest may be frozen as well,
so the guest wouldn't supply more available descriptors to the device.


Actually, if we cannot suspend the device we reset it, as we cannot
afford to modify used_idx.

I'm thinking now that your original idea was way better to be honest.
To not call vhost_get_vring_base in case the VM is shutting down and
we're not migrating seems to fit the situation way better. We save an
ioctl / potential device call for nothing.

agree, but I failed to find a code path indicating the VM is in
"shutting down" status, and it may be overkill to add
a new status field which is only used here.

So maybe a fix like this (see the sketch below):
1) if the device does not support _F_SUSPEND:
call virtio_queue_restore_last_avail_idx(dev, state->index)
2) if the device supports _F_SUSPEND but failed to suspend:
return -1, and vhost_virtqueue_stop() will call
virtio_queue_restore_last_avail_idx() in the original code path.

Does this look good to you?
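A simplified sketch of that two-case fix, assuming the surrounding QEMU
vhost-vdpa context (not a tested patch):

    /* Sketch only: case 1 recovers from the guest inside the backend,
     * case 2 returns -1 so vhost_virtqueue_stop() recovers as it does today. */
    static int example_get_vring_base(struct vhost_dev *dev,
                                      struct vhost_vring_state *ring)
    {
        struct vhost_vdpa *v = dev->opaque;

        if (!(dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
            /* 1) no _F_SUSPEND at all: recover from the guest here */
            virtio_queue_restore_last_avail_idx(dev->vdev, ring->index);
            ring->num = virtio_queue_get_last_avail_idx(dev->vdev, ring->index);
            return 0;
        }

        if (!v->suspended) {
            /* 2) _F_SUSPEND negotiated but suspend failed: let the caller
             * (vhost_virtqueue_stop) do the recovery */
            return -1;
        }

        return vhost_vdpa_call(dev, VHOST_GET_VRING_BASE, ring);
    }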



I think that means the last_avail_idx may not be reliable but still safe,
better than reading nothing, because the last_avail_idx always lags behind
the in-guest avail_idx.


Assuming we're migrating and we don't either reset or suspend the device:
* guest sets avail_idx=3
* device fetches last_avail_idx=3
* guest sets avail_idx=6
* VM_SUSPEND
* we call the unreliable get_vring_base and obtain device state 3.
* device is not reset, so it reads the guest's avail_idx = 6. It can
process the descriptors in avail_idx positions 3, 4 and 5, and write
that used_idx.
* virtio_load detects that inconsistency, as (uint)last_avail_idx -
(uint)used_idx > vring size.

So I think we need to keep printing an error and recovering from the
guest as a fallback.
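In code form, the inconsistency check described above boils down to
something like this (a simplified sketch of the virtio_load logic, not the
actual code):

    #include <stdint.h>
    #include <stdio.h>

    /* Free-running 16-bit indexes wrap, so the distance is computed in
     * unsigned 16-bit arithmetic; a consistent state never has more than
     * vring_num descriptors in flight. */
    static int indexes_consistent(uint16_t last_avail_idx, uint16_t used_idx,
                                  uint16_t vring_num)
    {
        return (uint16_t)(last_avail_idx - used_idx) <= vring_num;
    }

    int main(void)
    {
        /* The scenario above: state 3 was saved, but the device kept
         * running and wrote used_idx = 6. (uint16_t)(3 - 6) wraps to
         * 65533 > 256, so this prints 0 (inconsistent). */
        printf("%d\n", indexes_consistent(3, 6, 256));
        return 0;
    }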

so true, I think that is virtio_queue_restore_last_avail_idx()



I am not sure we can restore last_avail_idx with last_used_idx; there is
always a delta between them.


Yes, we assume it is safe to process these descriptors again or that
they'll be migrated when that is supported.

In any case, if a device does not work with this, we should be more
restrictive, not less.

Does it make sense to you?
Yes, so if the device doesn't support _F_SUSPEND, we call
virtio_queue_restore_last_avail_idx() in vhost_vdpa_get_vring_base().
Does that look good?

Thanks


Thanks!


Thanks


Thanks!


Thanks

Thanks!


Signed-off-by: Zhu Lingshan 
---
hw/virtio/vhost-vdpa.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index b3094e8a8b..ae176c06dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1397,7 +1397,7 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
return 0;
}

-if (!v->suspended) {
+if ((dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && 
(!v->suspended)) {
/*
 * Cannot trust in value returned by device, let vhost recover used
 * idx from guest.
--
2.39.1






Re: [PATCH] vdpa: dont check vhost_vdpa->suspended when unsupported

2023-06-08 Thread Zhu, Lingshan




On 6/8/2023 4:49 PM, Eugenio Perez Martin wrote:

On Thu, Jun 8, 2023 at 9:07 AM Zhu, Lingshan  wrote:



On 6/7/2023 9:51 PM, Eugenio Perez Martin wrote:

On Wed, Jun 7, 2023 at 11:09 AM Zhu Lingshan  wrote:

When reading the state of a virtqueue, vhost_vdpa needs
to check whether the device is suspended.

This commit verifies whether VHOST_BACKEND_F_SUSPEND is
negotiated when checking vhost_vdpa->suspended.


I'll add: Otherwise, qemu prints XXX error message.

On second thought, not returning an error prevents the caller
(vhost.c:vhost_virtqueue_stop) from recovering used idx from the
guest.

I'm not sure about the right fix for this, should we call
virtio_queue_restore_last_avail_idx here? Communicate that the backend
cannot suspend so vhost.c can print a better error message?

I agree it is better not to return an error.

Instead, whether the device does not support _F_SUSPEND or it failed to
suspend, I think vhost_vdpa should try to read the last_avail_idx from
the device anyway. We can print an error message here,
like: failed to suspend, the value may not be reliable.


We need to drop that value if it is not reliable. If the device keeps
using descriptors, used_idx may increase beyond avail_idx, so the
usual is to just restore whatever used_idx value the guest had at
stop.

This interface is used to read the last_avail_idx; the ideal case
is to fetch it after the device has been suspended.

If the device is not suspended, the value may be unreliable
because the device may keep consuming descriptors.
However, at that moment the guest may be frozen as well,
so the guest wouldn't supply more available descriptors to the device.

I think that means the last_avail_idx may not be reliable but still safe,
better than reading nothing, because the last_avail_idx always lags behind
the in-guest avail_idx.

I am not sure we can restore last_avail_idx with last_used_idx; there is
always a delta between them.

Thanks


Thanks!


Thanks

Thanks!


Signed-off-by: Zhu Lingshan 
---
   hw/virtio/vhost-vdpa.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index b3094e8a8b..ae176c06dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1397,7 +1397,7 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
   return 0;
   }

-if (!v->suspended) {
+if ((dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && 
(!v->suspended)) {
   /*
* Cannot trust in value returned by device, let vhost recover used
* idx from guest.
--
2.39.1






Re: [PATCH] vdpa: dont check vhost_vdpa->suspended when unsupported

2023-06-08 Thread Zhu, Lingshan




On 6/7/2023 9:51 PM, Eugenio Perez Martin wrote:

On Wed, Jun 7, 2023 at 11:09 AM Zhu Lingshan  wrote:

When reading the state of a virtqueue, vhost_vdpa needs
to check whether the device is suspended.

This commit verifies whether VHOST_BACKEND_F_SUSPEND is
negotiated when checking vhost_vdpa->suspended.


I'll add: Otherwise, qemu prints XXX error message.

On second thought, not returning an error prevents the caller
(vhost.c:vhost_virtqueue_stop) from recovering used idx from the
guest.

I'm not sure about the right fix for this, should we call
virtio_queue_restore_last_avail_idx here? Communicate that the backend
cannot suspend so vhost.c can print a better error message?

I agree it is better not to return an error.

Instead, whether the device does not support _F_SUSPEND or it failed to
suspend, I think vhost_vdpa should try to read the last_avail_idx from
the device anyway. We can print an error message here,
like: failed to suspend, the value may not be reliable.

Thanks


Thanks!


Signed-off-by: Zhu Lingshan 
---
  hw/virtio/vhost-vdpa.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index b3094e8a8b..ae176c06dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1397,7 +1397,7 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
  return 0;
  }

-if (!v->suspended) {
+if ((dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && 
(!v->suspended)) {
  /*
   * Cannot trust in value returned by device, let vhost recover used
   * idx from guest.
--
2.39.1






[PATCH] vdpa: dont check vhost_vdpa->suspended when unsupported

2023-06-07 Thread Zhu Lingshan
When reading the state of a virtqueue, vhost_vdpa needs
to check whether the device is suspended.

This commit verifies whether VHOST_BACKEND_F_SUSPEND is
negotiated when checking vhost_vdpa->suspended.

Signed-off-by: Zhu Lingshan 
---
 hw/virtio/vhost-vdpa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index b3094e8a8b..ae176c06dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1397,7 +1397,7 @@ static int vhost_vdpa_get_vring_base(struct vhost_dev 
*dev,
 return 0;
 }
 
-if (!v->suspended) {
+if ((dev->backend_cap & BIT_ULL(VHOST_BACKEND_F_SUSPEND)) && 
(!v->suspended)) {
 /*
  * Cannot trust in value returned by device, let vhost recover used
  * idx from guest.
-- 
2.39.1




Re: [RFC v2 12/13] vdpa: preemptive kick at enable

2023-01-13 Thread Zhu, Lingshan




On 1/13/2023 10:31 AM, Jason Wang wrote:

On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez  wrote:

Spuriously kick the destination device's queue so it knows in case there
are new descriptors.

RFC: This is somehow a gray area. The guest may have placed descriptors
in a virtqueue but not kicked it, so it might be surprised if the device
starts processing it.

So I think this is kind of the work of the vDPA parent. For the parent
that needs this trick, we should do it in the parent driver.

Agree, it looks easier to implement this in the parent driver;
I can implement it in ifcvf set_vq_ready right now

Thanks
Zhu Lingshan
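As a rough illustration of doing this in the parent driver, the enable path
could notify the queue once when it becomes ready; all names below are
hypothetical, not the actual ifcvf code:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical parent-driver state, not the real ifcvf structures. */
    struct example_vring {
        bool ready;
    };

    struct example_hw {
        struct example_vring vring[16];
    };

    /* Stand-in for writing the queue's notify register (doorbell). */
    static void example_notify_queue(struct example_hw *hw, uint16_t qid)
    {
        (void)hw;
        (void)qid;
    }

    /* Preemptive kick at enable: when a destination queue is made ready
     * after migration, notify it once in case the source guest had placed
     * descriptors without kicking. */
    static void example_set_vq_ready(struct example_hw *hw, uint16_t qid,
                                     bool ready)
    {
        hw->vring[qid].ready = ready;
        if (ready) {
            example_notify_queue(hw, qid);
        }
    }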


Thanks


However, that information is not in the migration stream and it should
be an edge case anyhow, being resilient to parallel notifications from
the guest.

Signed-off-by: Eugenio Pérez 
---
  hw/virtio/vhost-vdpa.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 40b7e8706a..dff94355dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev 
*dev, int ready)
  }
  trace_vhost_vdpa_set_vring_ready(dev);
  for (i = 0; i < dev->nvqs; ++i) {
+VirtQueue *vq;
  struct vhost_vring_state state = {
  .index = dev->vq_index + i,
  .num = 1,
  };
  vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
+
+/* Preemptive kick */
+vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
+event_notifier_set(virtio_queue_get_host_notifier(vq));
  }
  return 0;
  }
--
2.31.1






[Qemu-devel] [PATCH] send readcapacity10 when readcapacity16 failed

2015-12-28 Thread Zhu Lingshan
When playing with a Dell MD3000 target, it is
for sure a TYPE_DISK, but readcapacity16 would fail.
Then we found that readcapacity10 succeeded. It
looks like the target just supports readcapacity10
even though it is a TYPE_DISK or has some
TYPE_ROM characteristics.

This patch gives a chance to send
readcapacity10 when readcapacity16 fails.
This patch is not harmful to the original paths.

Signed-off-by: Zhu Lingshan 
---
 block/iscsi.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index bd1f1bf..c8d167f 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1243,8 +1243,9 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, 
Error **errp)
 iscsilun->lbprz = !!rc16->lbprz;
 iscsilun->use_16_for_rw = (rc16->returned_lba > 0xffffffff);
 }
+break;
 }
-break;
+//fall through to try readcapacity10 instead
 case TYPE_ROM:
 task = iscsi_readcapacity10_sync(iscsilun->iscsi, iscsilun->lun, 
0, 0);
 if (task != NULL && task->status == SCSI_STATUS_GOOD) {
-- 
2.6.2
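The control flow the patch creates is the classic switch fall-through;
roughly, in simplified form (stand-in functions, not the actual
block/iscsi.c code):

    #include <stdbool.h>
    #include <stdio.h>

    /* Stand-ins for the real READ CAPACITY helpers in block/iscsi.c. */
    static bool readcapacity16_ok(void) { return false; /* e.g. the MD3000 */ }
    static bool readcapacity10_ok(void) { return true; }

    enum { TYPE_DISK, TYPE_ROM };

    static void read_capacity(int type)
    {
        switch (type) {
        case TYPE_DISK:
            if (readcapacity16_ok()) {
                break;   /* READ CAPACITY(16) worked: done */
            }
            /* fall through: try READ CAPACITY(10) instead */
        case TYPE_ROM:
            if (readcapacity10_ok()) {
                printf("readcapacity10 succeeded\n");
            }
            break;
        }
    }

    int main(void)
    {
        read_capacity(TYPE_DISK);
        return 0;
    }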




[Qemu-devel] [PATCH] iscsi: fix readcapacity error message

2015-12-15 Thread Zhu Lingshan
fix: The error message for readcapacity16 incorrectly mentioned
a readcapacity10 failure; fixed the error message.

Signed-off-by: Zhu Lingshan 
---
 block/iscsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index bd1f1bf..eb28ddc 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1270,7 +1270,7 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, 
Error **errp)
  && retries-- > 0);
 
 if (task == NULL || task->status != SCSI_STATUS_GOOD) {
-error_setg(errp, "iSCSI: failed to send readcapacity10 command.");
+error_setg(errp, "iSCSI: failed to send readcapacity10/16 command");
 } else if (!iscsilun->block_size ||
iscsilun->block_size % BDRV_SECTOR_SIZE) {
 error_setg(errp, "iSCSI: the target returned an invalid "
-- 
2.6.2




Re: [Qemu-devel] [Qemu-block] [PATCH] iscsi: fix readcapacity error message

2015-12-15 Thread Zhu Lingshan

OK, thanks, I have sent the patch again and CC'd them

On 12/16/2015 04:20 AM, John Snow wrote:

CC qemu-trivial, I think this can go in through that tree.

On 12/14/2015 10:33 PM, Zhu Lingshan wrote:

fix: The error message for readcapacity16 incorrectly mentioned
a readcapacity10 failure; fixed the error message.

Signed-off-by: Zhu Lingshan 
---
  block/iscsi.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index bd1f1bf..eb28ddc 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1270,7 +1270,7 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, 
Error **errp)
   && retries-- > 0);
  
  if (task == NULL || task->status != SCSI_STATUS_GOOD) {

-error_setg(errp, "iSCSI: failed to send readcapacity10 command.");
+error_setg(errp, "iSCSI: failed to send readcapacity10/16 command");
  } else if (!iscsilun->block_size ||
 iscsilun->block_size % BDRV_SECTOR_SIZE) {
  error_setg(errp, "iSCSI: the target returned an invalid "


Reviewed-by: John Snow 






[Qemu-devel] [PATCH] iscsi: fix readcapacity error message

2015-12-15 Thread Zhu Lingshan
fix: The error message for readcapacity16 incorrectly mentioned
a readcapacity10 failure; fixed the error message.

Signed-off-by: Zhu Lingshan 
---
 block/iscsi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index bd1f1bf..eb28ddc 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1270,7 +1270,7 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, 
Error **errp)
  && retries-- > 0);
 
 if (task == NULL || task->status != SCSI_STATUS_GOOD) {
-error_setg(errp, "iSCSI: failed to send readcapacity10 command.");
+error_setg(errp, "iSCSI: failed to send readcapacity10/16 command");
 } else if (!iscsilun->block_size ||
iscsilun->block_size % BDRV_SECTOR_SIZE) {
 error_setg(errp, "iSCSI: the target returned an invalid "
-- 
2.6.2




Re: [Qemu-devel] [PATCH] fix:readcapacity 10 failure shown even 16 sent

2015-12-10 Thread Zhu Lingshan

Yeah, I assume that is an easy and efficient solution

Thanks

On 12/10/2015 04:57 PM, Peter Lieven wrote:

Am 10.12.2015 um 09:55 schrieb Paolo Bonzini:


On 10/12/2015 03:59, Zhu Lingshan wrote:

-if (task == NULL || task->status != SCSI_STATUS_GOOD) {
+if ((rc16 != NULL) && ((task == NULL) || (task->status != SCSI_STATUS_GOOD))) {
+error_setg(errp, "iSCSI: failed to send readcapacity16 command.");
+}

You need an "else" here.


rc16 can't be non-NULL if task is NULL.

I would not spend too much energy here and just adjust the error message to

"iSCSI: failed to send readcapacity10/16 command."

Peter








[Qemu-devel] [PATCH] fix:readcapacity 10 failure shown even 16 sent

2015-12-10 Thread Zhu Lingshan
This patch would fix a bug where a readcapacity10
failure would be shown no matter whether readcapacity10
or readcapacity16 was actually sent.

Signed-off-by: Zhu Lingshan 
---
 block/iscsi.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index bd1f1bf..6425cf4 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -1269,7 +1269,10 @@ static void iscsi_readcapacity_sync(IscsiLun *iscsilun, 
Error **errp)
  && task->sense.key == SCSI_SENSE_UNIT_ATTENTION
  && retries-- > 0);
 
-if (task == NULL || task->status != SCSI_STATUS_GOOD) {
+if ((rc16 != NULL) && ((task == NULL) || (task->status != SCSI_STATUS_GOOD))) {
+error_setg(errp, "iSCSI: failed to send readcapacity16 command.");
+}
+if ((rc10 != NULL) && ((task == NULL) || (task->status != SCSI_STATUS_GOOD))) {
 error_setg(errp, "iSCSI: failed to send readcapacity10 command.");
 } else if (!iscsilun->block_size ||
iscsilun->block_size % BDRV_SECTOR_SIZE) {
-- 
2.6.2