Re: [PATCH v2 2/2] virtio: fix vq # for balloon
On Wed, Jul 10, 2024 at 1:39 PM Michael S. Tsirkin wrote:
>
> On Wed, Jul 10, 2024 at 12:58:11PM -0700, Daniel Verkamp wrote:
> > On Wed, Jul 10, 2024 at 11:39 AM Michael S. Tsirkin wrote:
> > >
> > > On Wed, Jul 10, 2024 at 11:12:34AM -0700, Daniel Verkamp wrote:
> > > > On Wed, Jul 10, 2024 at 4:43 AM Michael S. Tsirkin wrote:
> > > > >
> > > > > virtio balloon communicates to the core that in some
> > > > > configurations vq #s are non-contiguous by setting name
> > > > > pointer to NULL.
> > > > >
> > > > > Unfortunately, core then turned around and just made them
> > > > > contiguous again. Result is that driver is out of spec.
> > > >
> > > > Thanks for fixing this - I think the overall approach of the patch
> > > > looks good.
> > > >
> > > > > Implement what the API was supposed to do
> > > > > in the 1st place. Compatibility with buggy hypervisors
> > > > > is handled inside virtio-balloon, which is the only driver
> > > > > making use of this facility, so far.
> > > >
> > > > In addition to virtio-balloon, I believe the same problem also affects
> > > > the virtio-fs device, since queue 1 is only supposed to be present if
> > > > VIRTIO_FS_F_NOTIFICATION is negotiated, and the request queues are
> > > > meant to be queue indexes 2 and up. From a look at the Linux driver
> > > > (virtio_fs.c), it appears that it never acks VIRTIO_FS_F_NOTIFICATION
> > > > and assumes that request queues start at index 1 rather than 2, which
> > > > looks out of spec to me, but the current device implementations (that
> > > > I am aware of, anyway) are also broken in the same way, so it ends up
> > > > working today. Queue numbering in a spec-compliant device and the
> > > > current Linux driver would mismatch; what the driver considers to be
> > > > the first request queue (index 1) would be ignored by the device, since
> > > > queue index 1 has no function if F_NOTIFICATION isn't negotiated.
> > >
> > > Oh, thanks a lot for pointing this out!
> > >
> > > I see, so this patch is no good as is; we need to add a workaround for
> > > virtio-fs first.
> > >
> > > The QEMU workaround is simple - just add an extra queue. But I did not
> > > research how this would interact with vhost-user.
> > >
> > > From the driver POV, I guess we could just ignore queue #1 - would that
> > > be ok, or does it have performance implications?
> >
> > As a driver workaround for non-compliant devices, I think ignoring the
> > first request queue would be a reasonable approach if the device's
> > config advertises num_request_queues > 1. Unfortunately, both
> > virtiofsd and crosvm's virtio-fs device have hard-coded
> > num_request_queues = 1, so this won't help with those existing devices.
>
> Do they care what the vq # is, though?
> We could do some magic to translate VQ #s in QEMU.
>
> > Maybe there are other devices that we would need to consider as well;
> > commit 529395d2ae64 ("virtio-fs: add multi-queue support") quotes
> > benchmarks that seem to be from a different virtio-fs implementation
> > that does support multiple request queues, so the workaround could
> > possibly be used there.
> >
> > > Or do what I did for balloon here: try with spec-compliant #s first;
> > > if that fails, then assume it's the spec issue and shift by 1.
> >
> > If there is a way to "guess and check" without breaking spec-compliant
> > devices, that sounds reasonable too; however, I'm not sure how this
> > would work out in practice: an existing non-compliant device may fail
> > to start if the driver tries to enable queue index 2 when it only
> > supports one request queue,
>
> You don't try to enable the queue - the driver starts by checking the
> queue size. The way my patch works is that it assumes a queue that is
> not available has size 0.
>
> This was actually a documented way to check for PCI and MMIO:
> 	Read the virtqueue size from queue_size. This controls how big the
> 	virtqueue is (see 2.6 Virtqueues).
> 	If this field is 0, the virtqueue does not exist.
> MMIO:
> 	If the returned value is zero (0x0) the queue is not available.
>
> unfortunately not for CCW, but I guess CCW implementations outside
> of QEMU are uncommon enough that we can assume it's the same?
>
> To me the abo
Re: [PATCH v2 2/2] virtio: fix vq # for balloon
On Wed, Jul 10, 2024 at 11:39 AM Michael S. Tsirkin wrote:
>
> On Wed, Jul 10, 2024 at 11:12:34AM -0700, Daniel Verkamp wrote:
> > On Wed, Jul 10, 2024 at 4:43 AM Michael S. Tsirkin wrote:
> > >
> > > virtio balloon communicates to the core that in some
> > > configurations vq #s are non-contiguous by setting name
> > > pointer to NULL.
> > >
> > > Unfortunately, core then turned around and just made them
> > > contiguous again. Result is that driver is out of spec.
> >
> > Thanks for fixing this - I think the overall approach of the patch
> > looks good.
> >
> > > Implement what the API was supposed to do
> > > in the 1st place. Compatibility with buggy hypervisors
> > > is handled inside virtio-balloon, which is the only driver
> > > making use of this facility, so far.
> >
> > In addition to virtio-balloon, I believe the same problem also affects
> > the virtio-fs device, since queue 1 is only supposed to be present if
> > VIRTIO_FS_F_NOTIFICATION is negotiated, and the request queues are
> > meant to be queue indexes 2 and up. From a look at the Linux driver
> > (virtio_fs.c), it appears that it never acks VIRTIO_FS_F_NOTIFICATION
> > and assumes that request queues start at index 1 rather than 2, which
> > looks out of spec to me, but the current device implementations (that
> > I am aware of, anyway) are also broken in the same way, so it ends up
> > working today. Queue numbering in a spec-compliant device and the
> > current Linux driver would mismatch; what the driver considers to be
> > the first request queue (index 1) would be ignored by the device, since
> > queue index 1 has no function if F_NOTIFICATION isn't negotiated.
>
> Oh, thanks a lot for pointing this out!
>
> I see, so this patch is no good as is; we need to add a workaround for
> virtio-fs first.
>
> The QEMU workaround is simple - just add an extra queue. But I did not
> research how this would interact with vhost-user.
>
> From the driver POV, I guess we could just ignore queue #1 - would that
> be ok, or does it have performance implications?

As a driver workaround for non-compliant devices, I think ignoring the
first request queue would be a reasonable approach if the device's
config advertises num_request_queues > 1. Unfortunately, both
virtiofsd and crosvm's virtio-fs device have hard-coded
num_request_queues = 1, so this won't help with those existing devices.

Maybe there are other devices that we would need to consider as well;
commit 529395d2ae64 ("virtio-fs: add multi-queue support") quotes
benchmarks that seem to be from a different virtio-fs implementation
that does support multiple request queues, so the workaround could
possibly be used there.

> Or do what I did for balloon here: try with spec-compliant #s first;
> if that fails, then assume it's the spec issue and shift by 1.

If there is a way to "guess and check" without breaking spec-compliant
devices, that sounds reasonable too; however, I'm not sure how this
would work out in practice: an existing non-compliant device may fail
to start if the driver tries to enable queue index 2 when it only
supports one request queue, and a spec-compliant device would probably
balk if the driver tries to enable queue 1 but does not negotiate
VIRTIO_FS_F_NOTIFICATION. If there's a way to reset and retry the whole
virtio device initialization process when a device fails like this, then
maybe it's feasible. (Or can the driver tweak the virtqueue
configuration and try to set DRIVER_OK repeatedly until it works? It's
not clear to me whether this is allowed by the spec, or what device
implementations actually do in practice in this scenario.)

Thanks,
-- Daniel
Re: [PATCH v2 2/2] virtio: fix vq # for balloon
On Wed, Jul 10, 2024 at 4:43 AM Michael S. Tsirkin wrote:
>
> virtio balloon communicates to the core that in some
> configurations vq #s are non-contiguous by setting name
> pointer to NULL.
>
> Unfortunately, core then turned around and just made them
> contiguous again. Result is that driver is out of spec.

Thanks for fixing this - I think the overall approach of the patch
looks good.

> Implement what the API was supposed to do
> in the 1st place. Compatibility with buggy hypervisors
> is handled inside virtio-balloon, which is the only driver
> making use of this facility, so far.

In addition to virtio-balloon, I believe the same problem also affects
the virtio-fs device, since queue 1 is only supposed to be present if
VIRTIO_FS_F_NOTIFICATION is negotiated, and the request queues are
meant to be queue indexes 2 and up. From a look at the Linux driver
(virtio_fs.c), it appears that it never acks VIRTIO_FS_F_NOTIFICATION
and assumes that request queues start at index 1 rather than 2, which
looks out of spec to me, but the current device implementations (that
I am aware of, anyway) are also broken in the same way, so it ends up
working today. Queue numbering in a spec-compliant device and the
current Linux driver would mismatch; what the driver considers to be
the first request queue (index 1) would be ignored by the device, since
queue index 1 has no function if F_NOTIFICATION isn't negotiated.

[...]
> diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
> index 7d82facafd75..fa606e7321ad 100644
> --- a/drivers/virtio/virtio_pci_common.c
> +++ b/drivers/virtio/virtio_pci_common.c
> @@ -293,7 +293,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs,
>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
>  	struct virtqueue_info *vqi;
>  	u16 msix_vec;
> -	int i, err, nvectors, allocated_vectors, queue_idx = 0;
> +	int i, err, nvectors, allocated_vectors;
>
>  	vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
>  	if (!vp_dev->vqs)
> @@ -332,7 +332,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned int nvqs,
>  			msix_vec = allocated_vectors++;
>  		else
>  			msix_vec = VP_MSIX_VQ_VECTOR;
> -		vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback,
> +		vqs[i] = vp_setup_vq(vdev, i, vqi->callback,
>  				     vqi->name, vqi->ctx, msix_vec);
>  		if (IS_ERR(vqs[i])) {
>  			err = PTR_ERR(vqs[i]);
> @@ -368,7 +368,7 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs,
>  				struct virtqueue_info vqs_info[])
>  {
>  	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> -	int i, err, queue_idx = 0;
> +	int i, err;
>
>  	vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
>  	if (!vp_dev->vqs)
> @@ -388,8 +388,13 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned int nvqs,
>  			vqs[i] = NULL;
>  			continue;
>  		}
> +<<< HEAD
>  		vqs[i] = vp_setup_vq(vdev, queue_idx++, vqi->callback,
>  				     vqi->name, vqi->ctx,
> +===
> +		vqs[i] = vp_setup_vq(vdev, i, callbacks[i], names[i],
> +				     ctx ? ctx[i] : false,
> +>>> f814759f80b7... virtio: fix vq # for balloon

This still has merge markers in it.

Thanks,
-- Daniel
[PATCH v2] ntb: initialize max_mw for Atom before using it
Commit ab760a0 ("ntb: Adding split BAR support for Haswell platforms")
changed ntb_device's mw from a fixed-size array into a pointer that is
allocated based on limits.max_mw; however, on Atom platforms, max_mw is
not initialized until ntb_device_setup(), which happens after the
allocation.

Fill out max_mw in ntb_atom_detect() to match ntb_xeon_detect(); this
happens before the use of max_mw in the ndev->mw allocation.

Fixes a null pointer dereference on Atom platforms with NTB hardware.

v2: fix typo (mw_max should be max_mw)

Signed-off-by: Daniel Verkamp
Acked-by: Dave Jiang
---
 drivers/ntb/ntb_hw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ntb/ntb_hw.c b/drivers/ntb/ntb_hw.c
index cd29b10..d14ca0d 100644
--- a/drivers/ntb/ntb_hw.c
+++ b/drivers/ntb/ntb_hw.c
@@ -1660,6 +1660,7 @@ static int ntb_atom_detect(struct ntb_device *ndev)
 	u32 ppd;

 	ndev->hw_type = BWD_HW;
+	ndev->limits.max_mw = BWD_MAX_MW;

 	rc = pci_read_config_dword(ndev->pdev, NTB_PPD_OFFSET, &ppd);
 	if (rc)
--
2.1.0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH] ntb: initialize max_mw for Atom before using it
Commit ab760a0 ("ntb: Adding split BAR support for Haswell platforms")
changed ntb_device's mw from a fixed-size array into a pointer that is
allocated based on limits.max_mw; however, on Atom platforms, max_mw is
not initialized until ntb_device_setup(), which happens after the
allocation.

Fill out max_mw in ntb_atom_detect() to match ntb_xeon_detect(); this
happens before the use of max_mw in the ndev->mw allocation.

Fixes a null pointer dereference on Atom platforms with NTB hardware.

Signed-off-by: Daniel Verkamp
Acked-by: Dave Jiang
---
 drivers/ntb/ntb_hw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/ntb/ntb_hw.c b/drivers/ntb/ntb_hw.c
index cd29b10..2f065f0 100644
--- a/drivers/ntb/ntb_hw.c
+++ b/drivers/ntb/ntb_hw.c
@@ -1660,6 +1660,7 @@ static int ntb_atom_detect(struct ntb_device *ndev)
 	u32 ppd;

 	ndev->hw_type = BWD_HW;
+	ndev->limits.mw_max = BWD_MAX_MW;

 	rc = pci_read_config_dword(ndev->pdev, NTB_PPD_OFFSET, &ppd);
 	if (rc)
--
2.1.0