Re: [PATCH v2 2/2] virtio: fix vq # for balloon

2024-07-10 Thread Daniel Verkamp
On Wed, Jul 10, 2024 at 1:39 PM Michael S. Tsirkin wrote:
>
> On Wed, Jul 10, 2024 at 12:58:11PM -0700, Daniel Verkamp wrote:
> > On Wed, Jul 10, 2024 at 11:39 AM Michael S. Tsirkin  wrote:
> > >
> > > On Wed, Jul 10, 2024 at 11:12:34AM -0700, Daniel Verkamp wrote:
> > > > On Wed, Jul 10, 2024 at 4:43 AM Michael S. Tsirkin wrote:
> > > > >
> > > > > virtio balloon communicates to the core that in some
> > > > > configurations vq #s are non-contiguous by setting name
> > > > > pointer to NULL.
> > > > >
> > > > > Unfortunately, core then turned around and just made them
> > > > > contiguous again. Result is that driver is out of spec.
> > > >
> > > > Thanks for fixing this - I think the overall approach of the patch 
> > > > looks good.
> > > >
> > > > > Implement what the API was supposed to do
> > > > > in the 1st place. Compatibility with buggy hypervisors
> > > > > is handled inside virtio-balloon, which is the only driver
> > > > > making use of this facility, so far.
> > > >
> > > > In addition to virtio-balloon, I believe the same problem also affects
> > > > the virtio-fs device, since queue 1 is only supposed to be present if
> > > > VIRTIO_FS_F_NOTIFICATION is negotiated, and the request queues are
> > > > meant to be queue indexes 2 and up. From a look at the Linux driver
> > > > (virtio_fs.c), it appears like it never acks VIRTIO_FS_F_NOTIFICATION
> > > > and assumes that request queues start at index 1 rather than 2, which
> > > > looks out of spec to me, but the current device implementations (that
> > > > I am aware of, anyway) are also broken in the same way, so it ends up
> > > > working today. Queue numbering in a spec-compliant device and the
> > > > current Linux driver would mismatch; what the driver considers to be
> > > > the first request queue (index 1) would be ignored by the device since
> > > > queue index 1 has no function if F_NOTIFICATION isn't negotiated.
> > >
> > >
> > > Oh, thanks a lot for pointing this out!
> > >
> > > I see so this patch is no good as is, we need to add a workaround for
> > > virtio-fs first.
> > >
> > > QEMU workaround is simple - just add an extra queue. But I did not
> > > research how this would interact with vhost-user.
> > >
> > > From driver POV, I guess we could just ignore queue # 1 - would that be
> > > ok or does it have performance implications?
> >
> > As a driver workaround for non-compliant devices, I think ignoring the
> > first request queue would be a reasonable approach if the device's
> > config advertises num_request_queues > 1. Unfortunately, both
> > virtiofsd and crosvm's virtio-fs device have hard-coded
> > num_request_queues = 1, so this won't help with those existing devices.
>
> Do they care what the vq # is though?
> We could do some magic to translate VQ #s in qemu.
>
>
> > Maybe there are other devices that we would need to consider as well;
> > commit 529395d2ae64 ("virtio-fs: add multi-queue support") quotes
> > benchmarks that seem to be from a different virtio-fs implementation
> > that does support multiple request queues, so the workaround could
> > possibly be used there.
> >
> > > Or do what I did for balloon here: try with spec compliant #s first,
> > > if that fails then assume it's the spec issue and shift by 1.
> >
> > If there is a way to "guess and check" without breaking spec-compliant
> > devices, that sounds reasonable too; however, I'm not sure how this
> > would work out in practice: an existing non-compliant device may fail
> > to start if the driver tries to enable queue index 2 when it only
> > supports one request queue,
>
> You don't try to enable queue - driver starts by checking queue size.
> The way my patch works is that it assumes a non existing queue has
> size 0 if not available.
>
> This was actually a documented way to check for PCI and MMIO:
> Read the virtqueue size from queue_size. This controls how big the 
> virtqueue is (see 2.6 Virtqueues).
> If this field is 0, the virtqueue does not exist.
> MMIO:
> If the returned value is zero (0x0) the queue is not available.
>
> unfortunately not for CCW, but I guess CCW implementations outside
> of QEMU are uncommon enough that we can assume it's the same?
>
>
> To me the abo

Re: [PATCH v2 2/2] virtio: fix vq # for balloon

2024-07-10 Thread Daniel Verkamp
On Wed, Jul 10, 2024 at 11:39 AM Michael S. Tsirkin wrote:
>
> On Wed, Jul 10, 2024 at 11:12:34AM -0700, Daniel Verkamp wrote:
> > On Wed, Jul 10, 2024 at 4:43 AM Michael S. Tsirkin  wrote:
> > >
> > > virtio balloon communicates to the core that in some
> > > configurations vq #s are non-contiguous by setting name
> > > pointer to NULL.
> > >
> > > Unfortunately, core then turned around and just made them
> > > contiguous again. Result is that driver is out of spec.
> >
> > Thanks for fixing this - I think the overall approach of the patch
> > looks good.
> >
> > > Implement what the API was supposed to do
> > > in the 1st place. Compatibility with buggy hypervisors
> > > is handled inside virtio-balloon, which is the only driver
> > > making use of this facility, so far.
> >
> > In addition to virtio-balloon, I believe the same problem also affects
> > the virtio-fs device, since queue 1 is only supposed to be present if
> > VIRTIO_FS_F_NOTIFICATION is negotiated, and the request queues are
> > meant to be queue indexes 2 and up. From a look at the Linux driver
> > (virtio_fs.c), it appears like it never acks VIRTIO_FS_F_NOTIFICATION
> > and assumes that request queues start at index 1 rather than 2, which
> > looks out of spec to me, but the current device implementations (that
> > I am aware of, anyway) are also broken in the same way, so it ends up
> > working today. Queue numbering in a spec-compliant device and the
> > current Linux driver would mismatch; what the driver considers to be
> > the first request queue (index 1) would be ignored by the device since
> > queue index 1 has no function if F_NOTIFICATION isn't negotiated.
>
>
> Oh, thanks a lot for pointing this out!
>
> I see so this patch is no good as is, we need to add a workaround for
> virtio-fs first.
>
> QEMU workaround is simple - just add an extra queue. But I did not
> research how this would interact with vhost-user.
>
> From driver POV, I guess we could just ignore queue # 1 - would that be
> ok or does it have performance implications?

As a driver workaround for non-compliant devices, I think ignoring the
first request queue would be a reasonable approach if the device's
config advertises num_request_queues > 1. Unfortunately, both
virtiofsd and crosvm's virtio-fs device have hard-coded
num_request_queues = 1, so this won't help with those existing devices.
Maybe there are other devices that we would need to consider as well;
commit 529395d2ae64 ("virtio-fs: add multi-queue support") quotes
benchmarks that seem to be from a different virtio-fs implementation
that does support multiple request queues, so the workaround could
possibly be used there.

> Or do what I did for balloon here: try with spec compliant #s first,
> if that fails then assume it's the spec issue and shift by 1.

If there is a way to "guess and check" without breaking spec-compliant
devices, that sounds reasonable too; however, I'm not sure how this
would work out in practice: an existing non-compliant device may fail
to start if the driver tries to enable queue index 2 when it only
supports one request queue, and a spec-compliant device would probably
balk if the driver tries to enable queue 1 but does not negotiate
VIRTIO_FS_F_NOTIFICATION. If there's a way to reset and retry the
whole virtio device initialization process if a device fails like
this, then maybe it's feasible. (Or can the driver tweak the virtqueue
configuration and try to set DRIVER_OK repeatedly until it works? It's
not clear to me if this is allowed by the spec, or what device
implementations actually do in practice in this scenario.)
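
To make the "guess and check" idea concrete, here is a minimal sketch of
probing by queue size instead of by enabling queues. It assumes a
transport where a queue that does not exist reads back a size of 0 (PCI
and MMIO document this; other transports may not). struct fake_dev,
probe_queue_size() and find_request_vq_base() are hypothetical names
used only for this illustration, not existing kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical device model: queue_size[i] is the value a driver would
 * read back as the size of queue i. 0 means "queue does not exist".
 */
struct fake_dev {
	const uint16_t *queue_size;
	size_t nqueues;
};

/* Read the size of queue idx; out-of-range queues report size 0. */
uint16_t probe_queue_size(const struct fake_dev *dev, size_t idx)
{
	return idx < dev->nqueues ? dev->queue_size[idx] : 0;
}

/*
 * Decide where the request queues start without enabling anything:
 * try the spec-compliant index (2) first, then fall back to index 1.
 * Returns -1 if neither candidate queue exists.
 */
int find_request_vq_base(const struct fake_dev *dev)
{
	assert(dev && dev->queue_size);

	if (probe_queue_size(dev, 2) != 0)
		return 2;	/* spec-compliant numbering */
	if (probe_queue_size(dev, 1) != 0)
		return 1;	/* shifted numbering used by existing devices */
	return -1;
}
```

Note the limitation: a non-compliant device with more than one request
queue also has a queue at index 2, so this sketch would pick base 2 and
silently skip that device's first request queue, which is the "ignore
the first request queue" workaround rather than a true detection.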

Thanks,
-- Daniel



Re: [PATCH v2 2/2] virtio: fix vq # for balloon

2024-07-10 Thread Daniel Verkamp
On Wed, Jul 10, 2024 at 4:43 AM Michael S. Tsirkin wrote:
>
> virtio balloon communicates to the core that in some
> configurations vq #s are non-contiguous by setting name
> pointer to NULL.
>
> Unfortunately, core then turned around and just made them
> contiguous again. Result is that driver is out of spec.

Thanks for fixing this - I think the overall approach of the patch looks good.

> Implement what the API was supposed to do
> in the 1st place. Compatibility with buggy hypervisors
> is handled inside virtio-balloon, which is the only driver
> making use of this facility, so far.

In addition to virtio-balloon, I believe the same problem also affects
the virtio-fs device, since queue 1 is only supposed to be present if
VIRTIO_FS_F_NOTIFICATION is negotiated, and the request queues are
meant to be queue indexes 2 and up. From a look at the Linux driver
(virtio_fs.c), it appears that it never acks VIRTIO_FS_F_NOTIFICATION
and assumes that request queues start at index 1 rather than 2, which
looks out of spec to me, but the current device implementations (that
I am aware of, anyway) are also broken in the same way, so it ends up
working today. Queue numbering in a spec-compliant device and the
current Linux driver would mismatch; what the driver considers to be
the first request queue (index 1) would be ignored by the device since
queue index 1 has no function if F_NOTIFICATION isn't negotiated.
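
As a rough illustration of the mismatch, the sketch below compares the
two numbering schemes. It assumes the layout described above plus queue
0 being the high-priority queue; the enum and function names are
hypothetical and do not come from virtio_fs.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: 0 = hiprio, 1 = notification, 2.. = request queues. */
enum {
	FS_VQ_HIPRIO = 0,
	FS_VQ_NOTIFICATION = 1,
	FS_VQ_FIRST_REQUEST = 2,
};

/* Index of the n-th request queue (n from 0) on a spec-compliant device. */
unsigned int spec_request_vq(unsigned int n)
{
	return FS_VQ_FIRST_REQUEST + n;
}

/* Index of the n-th request queue as the current driver numbers it. */
unsigned int driver_request_vq(unsigned int n)
{
	return 1 + n;
}

/*
 * Whether queue idx has a function on a spec-compliant device with
 * nreq request queues, given whether F_NOTIFICATION was negotiated.
 */
bool spec_vq_present(unsigned int idx, bool notification, unsigned int nreq)
{
	assert(nreq >= 1);

	if (idx == FS_VQ_NOTIFICATION)
		return notification;
	return idx < FS_VQ_FIRST_REQUEST + nreq;
}
```

With one request queue and no F_NOTIFICATION, driver_request_vq(0) is 1,
and spec_vq_present(1, false, 1) is false: the driver's first request
queue lands on an index the device has no use for.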

[...]
> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> index 7d82facafd75..fa606e7321ad 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -293,7 +293,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs,
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> struct virtqueue_info *vqi;
> u16 msix_vec;
> -   int i, err, nvectors, allocated_vectors, queue_idx = 0;
> +   int i, err, nvectors, allocated_vectors;
>
> vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
> if (!vp_dev->vqs)
> @@ -332,7 +332,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs,
> msix_vec = allocated_vectors++;
> else
> msix_vec = VP_MSIX_VQ_VECTOR;
> -   vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback,
> +   vqs[i] = vp_setup_vq(vdev, i, vqi->callback,
>  vqi->name, vqi->ctx, msix_vec);
> if (IS_ERR(vqs[i])) {
> err = PTR_ERR(vqs[i]);
> @@ -368,7 +368,7 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs,
> struct virtqueue_info vqs_info[])
>  {
> struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> -   int i, err, queue_idx = 0;
> +   int i, err;
>
> vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
> if (!vp_dev->vqs)
> @@ -388,8 +388,13 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs,
> vqs[i] = NULL;
> continue;
> }
> +<<< HEAD
> vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback,
>  vqi->name, vqi->ctx,
> +===
> +   vqs[i] = vp_setup_vq(vdev, i, callbacks[i], names[i],
> +ctx ? ctx[i] : false,
> +>>> f814759f80b7... virtio: fix vq # for balloon

This still has merge markers in it.

Thanks,
-- Daniel



[PATCH v2] ntb: initialize max_mw for Atom before using it

2015-05-13 Thread Daniel Verkamp
Commit ab760a0 (ntb: Adding split BAR support for Haswell platforms)
changed ntb_device's mw from a fixed-size array into a pointer that is
allocated based on limits.max_mw; however, on Atom platforms, max_mw
is not initialized until ntb_device_setup(), which happens after the
allocation.

Fill out max_mw in ntb_atom_detect() to match ntb_xeon_detect(); this
happens before the use of max_mw in the ndev->mw allocation.

Fixes a null pointer dereference on Atom platforms with ntb hardware.

v2: fix typo (mw_max should be max_mw)

Signed-off-by: Daniel Verkamp 
Acked-by: Dave Jiang 
---
 drivers/ntb/ntb_hw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ntb/ntb_hw.c b/drivers/ntb/ntb_hw.c
index cd29b10..d14ca0d 100644
--- a/drivers/ntb/ntb_hw.c
+++ b/drivers/ntb/ntb_hw.c
@@ -1660,6 +1660,7 @@ static int ntb_atom_detect(struct ntb_device *ndev)
u32 ppd;
 
ndev->hw_type = BWD_HW;
+   ndev->limits.max_mw = BWD_MAX_MW;
 
rc = pci_read_config_dword(ndev->pdev, NTB_PPD_OFFSET, &ppd);
if (rc)
-- 
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
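
The ordering problem the patch above describes can be modelled in user
space. This is only a sketch under stated assumptions: struct dev_model
and the model_*() helpers are hypothetical stand-ins for the driver's
detect and allocation steps, the BWD_MAX_MW value is a placeholder, and
the model reports an error for a zero count instead of reproducing the
kernel's actual failure mode.

```c
#include <assert.h>
#include <stdlib.h>

#define BWD_MAX_MW 2	/* placeholder value for this sketch */

struct mw_model {
	unsigned long size;
};

struct dev_model {
	unsigned int max_mw;	/* stands in for limits.max_mw */
	struct mw_model *mw;	/* allocated from max_mw */
};

/* Detect step: with the fix, max_mw is set here, before the allocation. */
void model_detect(struct dev_model *d, int with_fix)
{
	if (with_fix)
		d->max_mw = BWD_MAX_MW;
}

/* Allocation step: sizes the mw array from max_mw. */
int model_alloc_mw(struct dev_model *d)
{
	if (d->max_mw == 0)
		return -1;	/* max_mw was never initialized */
	d->mw = calloc(d->max_mw, sizeof(*d->mw));
	return d->mw ? 0 : -1;
}

/* Later use of the array; only valid after a successful allocation. */
void model_set_mw_size(struct dev_model *d, unsigned int i, unsigned long size)
{
	assert(d->mw && i < d->max_mw);
	d->mw[i].size = size;
}
```

Calling model_detect(d, 0) followed by model_alloc_mw(d) fails because
max_mw is still 0, while model_detect(d, 1) lets the allocation and the
later model_set_mw_size() calls succeed.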


[PATCH] ntb: initialize max_mw for Atom before using it

2015-05-13 Thread Daniel Verkamp
Commit ab760a0 (ntb: Adding split BAR support for Haswell platforms)
changed ntb_device's mw from a fixed-size array into a pointer that is
allocated based on limits.max_mw; however, on Atom platforms, max_mw
is not initialized until ntb_device_setup(), which happens after the
allocation.

Fill out max_mw in ntb_atom_detect() to match ntb_xeon_detect(); this
happens before the use of max_mw in the ndev->mw allocation.

Fixes a null pointer dereference on Atom platforms with ntb hardware.

Signed-off-by: Daniel Verkamp 
Acked-by: Dave Jiang 
---
 drivers/ntb/ntb_hw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ntb/ntb_hw.c b/drivers/ntb/ntb_hw.c
index cd29b10..2f065f0 100644
--- a/drivers/ntb/ntb_hw.c
+++ b/drivers/ntb/ntb_hw.c
@@ -1660,6 +1660,7 @@ static int ntb_atom_detect(struct ntb_device *ndev)
u32 ppd;
 
ndev->hw_type = BWD_HW;
+   ndev->limits.mw_max = BWD_MAX_MW;
 
rc = pci_read_config_dword(ndev->pdev, NTB_PPD_OFFSET, &ppd);
if (rc)
-- 
2.1.0


