Re: [PATCH] virtio_blk: set the default scheduler to none
On Thu, Dec 7, 2023 at 12:33 PM Li Feng wrote: > > virtio-blk is generally used in cloud computing scenarios, where the > performance of virtual disks is very important. The mq-deadline scheduler > has a big performance drop compared to none with single queue. At least you can choose the scheduler based on if mq is supported or not? Thanks
Re: [PATCH 0/3] Fix virtio-blk issue with SWIOTLB
On 2019/1/14 下午5:50, Christoph Hellwig wrote: On Mon, Jan 14, 2019 at 05:41:56PM +0800, Jason Wang wrote: On 2019/1/11 下午5:15, Joerg Roedel wrote: On Fri, Jan 11, 2019 at 11:29:31AM +0800, Jason Wang wrote: Just wonder if my understanding is correct IOMMU_PLATFORM must be set for all virtio devices under AMD-SEV guests? Yes, that is correct. Emulated DMA can only happen on the SWIOTLB aperture, because that memory is not encrypted. The guest bounces the data then to its encrypted memory. Regards, Joerg Thanks, have you tested vhost-net in this case. I suspect it may not work Which brings me back to my pet pevee that we need to take actions that virtio uses the proper dma mapping API by default with quirks for legacy cases. The magic bypass it uses is just causing problems over problems. Yes, I fully agree with you. This is probably an exact example of such problem. Thanks
Re: [PATCH 0/3] Fix virtio-blk issue with SWIOTLB
On 2019/1/11 下午5:15, Joerg Roedel wrote: On Fri, Jan 11, 2019 at 11:29:31AM +0800, Jason Wang wrote: Just wonder if my understanding is correct IOMMU_PLATFORM must be set for all virtio devices under AMD-SEV guests? Yes, that is correct. Emulated DMA can only happen on the SWIOTLB aperture, because that memory is not encrypted. The guest bounces the data then to its encrypted memory. Regards, Joerg Thanks, have you tested vhost-net in this case. I suspect it may not work
Re: [PATCH 0/3] Fix virtio-blk issue with SWIOTLB
On 2019/1/10 下午9:44, Joerg Roedel wrote: Hi, there is a problem with virtio-blk driven devices when virtio-ring uses SWIOTLB through the DMA-API. This happens for example in AMD-SEV enabled guests, where the guest RAM is mostly encrypted and all emulated DMA has to happen to/from the SWIOTLB aperture. The problem is a limitation of the SWIOTLB implementation, which does not support allocations larger than 256kb. When the virtio-blk driver tries to read/write a block larger than that, the allocation of the dma-handle fails and an IO error is reported. This patch-set adds a check to the virtio-code whether it might be using SWIOTLB bounce buffering and limits the maximum segment size in the virtio-blk driver in this case, so that it doesn't try to do larger reads/writes. Just wonder if my understanding is correct IOMMU_PLATFORM must be set for all virtio devices under AMD-SEV guests? Thanks Please review. Thanks, Joerg Joerg Roedel (3): swiotlb: Export maximum allocation size virtio: Introduce virtio_max_dma_size() virtio-blk: Consider virtio_max_dma_size() for maximum segment size drivers/block/virtio_blk.c | 10 ++ drivers/virtio/virtio_ring.c | 11 +++ include/linux/swiotlb.h | 12 include/linux/virtio.h | 2 ++ 4 files changed, 31 insertions(+), 4 deletions(-)
Re: [PATCH 1/9] virtio_pci: remove struct virtio_pci_vq_info
On 2017年02月07日 17:38, Christoph Hellwig wrote: On Tue, Feb 07, 2017 at 03:17:02PM +0800, Jason Wang wrote: The check is still there. Meh, I could swear I fixed it up. Here is an updated version: --- From bf5e3b7fd272aea32388570503f00d0ab592fc2a Mon Sep 17 00:00:00 2001 From: Christoph Hellwig Date: Wed, 25 Jan 2017 13:40:21 +0100 Subject: virtio_pci: remove struct virtio_pci_vq_info We don't really need struct virtio_pci_vq_info, as most field in there are redundant: - the vq backpointer is not strictly neede to start with - the entry in the vqs list is not needed - the generic virtqueue already has list, we only need to check if it has a callback to get the same semantics - we can use a simple array to look up the MSI-X vec if needed. - That simple array now also duoble serves to replace the per_vq_vectors flag Signed-off-by: Christoph Hellwig Reviewed-by: Jason Wang --- drivers/virtio/virtio_pci_common.c | 117 +++-- drivers/virtio/virtio_pci_common.h | 25 +--- drivers/virtio/virtio_pci_legacy.c | 6 +- drivers/virtio/virtio_pci_modern.c | 6 +- 4 files changed, 39 insertions(+), 115 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index 186cbab327b8..1f9fac7dad61 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -62,16 +62,13 @@ static irqreturn_t vp_config_changed(int irq, void *opaque) static irqreturn_t vp_vring_interrupt(int irq, void *opaque) { struct virtio_pci_device *vp_dev = opaque; - struct virtio_pci_vq_info *info; irqreturn_t ret = IRQ_NONE; - unsigned long flags; + struct virtqueue *vq; - spin_lock_irqsave(&vp_dev->lock, flags); - list_for_each_entry(info, &vp_dev->virtqueues, node) { - if (vring_interrupt(irq, info->vq) == IRQ_HANDLED) + list_for_each_entry(vq, &vp_dev->vdev.vqs, list) { + if (vring_interrupt(irq, vq) == IRQ_HANDLED) ret = IRQ_HANDLED; } - spin_unlock_irqrestore(&vp_dev->lock, flags); return ret; } @@ -167,55 +164,6 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors, return err; } -static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, -void (*callback)(struct virtqueue *vq), -const char *name, -u16 msix_vec) -{ - struct virtio_pci_device *vp_dev = to_vp_device(vdev); - struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL); - struct virtqueue *vq; - unsigned long flags; - - /* fill out our structure that represents an active queue */ - if (!info) - return ERR_PTR(-ENOMEM); - - vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, msix_vec); - if (IS_ERR(vq)) - goto out_info; - - info->vq = vq; - if (callback) { - spin_lock_irqsave(&vp_dev->lock, flags); - list_add(&info->node, &vp_dev->virtqueues); - spin_unlock_irqrestore(&vp_dev->lock, flags); - } else { - INIT_LIST_HEAD(&info->node); - } - - vp_dev->vqs[index] = info; - return vq; - -out_info: - kfree(info); - return vq; -} - -static void vp_del_vq(struct virtqueue *vq) -{ - struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev); - struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index]; - unsigned long flags; - - spin_lock_irqsave(&vp_dev->lock, flags); - list_del(&info->node); - spin_unlock_irqrestore(&vp_dev->lock, flags); - - vp_dev->del_vq(info); - kfree(info); -} - /* the config->del_vqs() implementation */ void vp_del_vqs(struct virtio_device *vdev) { @@ -224,16 +172,15 @@ void vp_del_vqs(struct virtio_device *vdev) int i; list_for_each_entry_safe(vq, n, &vdev->vqs, list) { - if (vp_dev->per_vq_vectors) { - int v = vp_dev->vqs[vq->index]->msix_vector; + if (vp_dev->msix_vector_map) { + int v = vp_dev->msix_vector_map[vq->index]; if (v != VIRTIO_MSI_NO_VECTOR) free_irq(pci_irq_vector(vp_dev->pci_dev, v), vq); } - vp_del_vq(vq); + vp_dev->del_vq(vq); } - vp_dev->per_vq_vectors = false; if (vp_dev->intx_enabled) { free_irq(vp_dev->pci_dev->irq, vp_dev); @@ -261,8 +208,8 @@ void vp_del_vqs(struct virtio_device *vdev) vp_dev->msix_names = NULL; kfree(vp
Re: [PATCH 2/9] virtio_pci: use shared interrupts for virtqueues
On 2017年02月06日 01:15, Christoph Hellwig wrote: This lets IRQ layer handle dispatching IRQs to separate handlers for the case where we don't have per-VQ MSI-X vectors, and allows us to greatly simplify the code based on the assumption that we always have interrupt vector 0 (legacy INTx or config interrupt for MSI-X) available, and any other interrupt is request/freed throught the VQ, even if the actual interrupt line might be shared in some cases. This allows removing a great deal of variables keeping track of the interrupt state in struct virtio_pci_device, as we can now simply walk the list of VQs and deal with per-VQ interrupt handlers there, and only treat vector 0 special. Additionally clean up the VQ allocation code to properly unwind on error instead of having a single global cleanup label, which is error prone, and in this case also leads to more code. Signed-off-by: Christoph Hellwig Reviewed-by: Jason Wang --- drivers/virtio/virtio_pci_common.c | 235 - drivers/virtio/virtio_pci_common.h | 16 +-- 2 files changed, 106 insertions(+), 145 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index a33767318cbf..274dc1ff09c0 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -33,10 +33,8 @@ void vp_synchronize_vectors(struct virtio_device *vdev) struct virtio_pci_device *vp_dev = to_vp_device(vdev); int i; - if (vp_dev->intx_enabled) - synchronize_irq(vp_dev->pci_dev->irq); - - for (i = 0; i < vp_dev->msix_vectors; ++i) + synchronize_irq(pci_irq_vector(vp_dev->pci_dev, 0)); + for (i = 1; i < vp_dev->msix_vectors; i++) synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i)); } @@ -99,77 +97,10 @@ static irqreturn_t vp_interrupt(int irq, void *opaque) return vp_vring_interrupt(irq, opaque); } -static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors, - bool per_vq_vectors) -{ - struct virtio_pci_device *vp_dev = to_vp_device(vdev); - const char *name = dev_name(&vp_dev->vdev.dev); - unsigned i, v; - int err = -ENOMEM; - - vp_dev->msix_vectors = nvectors; - - vp_dev->msix_names = kmalloc(nvectors * sizeof *vp_dev->msix_names, -GFP_KERNEL); - if (!vp_dev->msix_names) - goto error; - vp_dev->msix_affinity_masks - = kzalloc(nvectors * sizeof *vp_dev->msix_affinity_masks, - GFP_KERNEL); - if (!vp_dev->msix_affinity_masks) - goto error; - for (i = 0; i < nvectors; ++i) - if (!alloc_cpumask_var(&vp_dev->msix_affinity_masks[i], - GFP_KERNEL)) - goto error; - - err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors, - PCI_IRQ_MSIX); - if (err < 0) - goto error; - vp_dev->msix_enabled = 1; - - /* Set the vector used for configuration */ - v = vp_dev->msix_used_vectors; - snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names, -"%s-config", name); - err = request_irq(pci_irq_vector(vp_dev->pci_dev, v), - vp_config_changed, 0, vp_dev->msix_names[v], - vp_dev); - if (err) - goto error; - ++vp_dev->msix_used_vectors; - - v = vp_dev->config_vector(vp_dev, v); - /* Verify we had enough resources to assign the vector */ - if (v == VIRTIO_MSI_NO_VECTOR) { - err = -EBUSY; - goto error; - } - - if (!per_vq_vectors) { - /* Shared vector for all VQs */ - v = vp_dev->msix_used_vectors; - snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names, -"%s-virtqueues", name); - err = request_irq(pci_irq_vector(vp_dev->pci_dev, v), - vp_vring_interrupt, 0, vp_dev->msix_names[v], - vp_dev); - if (err) - goto error; - ++vp_dev->msix_used_vectors; - } - return 0; -error: - return err; -} - -/* the config->del_vqs() implementation */ -void vp_del_vqs(struct virtio_device *vdev) +static void vp_remove_vqs(struct virtio_device *vdev) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); struct virtqueue *vq, *n; - int i; list_for_each_entry_safe(vq, n, &vdev->vqs, list) { if (vp_dev->msix_vector_map) { @@ -181,35 +112,33 @@ void vp_del_vqs(struct virtio_device *vdev)
Re: [PATCH 1/9] virtio_pci: remove struct virtio_pci_vq_info
On 2017年02月06日 01:15, Christoph Hellwig wrote: We don't really need struct virtio_pci_vq_info, as most field in there are redundant: - the vq backpointer is not strictly neede to start with - the entry in the vqs list is not needed - the generic virtqueue already has list, we only need to check if it has a callback to get the same semantics - we can use a simple array to look up the MSI-X vec if needed. - That simple array now also duoble serves to replace the per_vq_vectors flag Signed-off-by: Christoph Hellwig --- drivers/virtio/virtio_pci_common.c | 117 +++-- drivers/virtio/virtio_pci_common.h | 25 +--- drivers/virtio/virtio_pci_legacy.c | 6 +- drivers/virtio/virtio_pci_modern.c | 6 +- 4 files changed, 39 insertions(+), 115 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index 186cbab327b8..a33767318cbf 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -62,16 +62,13 @@ static irqreturn_t vp_config_changed(int irq, void *opaque) static irqreturn_t vp_vring_interrupt(int irq, void *opaque) { struct virtio_pci_device *vp_dev = opaque; - struct virtio_pci_vq_info *info; irqreturn_t ret = IRQ_NONE; - unsigned long flags; + struct virtqueue *vq; - spin_lock_irqsave(&vp_dev->lock, flags); - list_for_each_entry(info, &vp_dev->virtqueues, node) { - if (vring_interrupt(irq, info->vq) == IRQ_HANDLED) + list_for_each_entry(vq, &vp_dev->vdev.vqs, list) { + if (vq->callback && vring_interrupt(irq, vq) == IRQ_HANDLED) ret = IRQ_HANDLED; } The check is still there. Thanks
Re: [PATCH 2/9] virtio_pci: use shared interrupts for virtqueues
On 2017年02月03日 17:52, Christoph Hellwig wrote: On Fri, Feb 03, 2017 at 05:47:41PM +0800, Jason Wang wrote: No, we need to allocate the array larger in that case as want proper names for the interrupts. Consider the case of !per_vq_vectors, the size of msix_names is 2, but snprintf can do out of bound accessing here. (We name the msix shared by virtqueues with something like "%s-virtqueues" before the patch). Yes, that's what I meant above - we need to allocate a large array starting with this patch. I'll fix it up for the next version. I see. Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] virtio_pci: use shared interrupts for virtqueues
On 2017年02月03日 16:26, Christoph Hellwig wrote: On Fri, Feb 03, 2017 at 03:54:54PM +0800, Jason Wang wrote: On 2017年01月27日 16:16, Christoph Hellwig wrote: + snprintf(vp_dev->msix_names[i + 1], +sizeof(*vp_dev->msix_names), "%s-%s", dev_name(&vp_dev->vdev.dev), names[i]); err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec), - vring_interrupt, 0, - vp_dev->msix_names[msix_vec], - vqs[i]); + vring_interrupt, IRQF_SHARED, + vp_dev->msix_names[i + 1], vqs[i]); Do we need to check per_vq_vectors before dereferencing msix_names[i + 1] ? No, we need to allocate the array larger in that case as want proper names for the interrupts. Consider the case of !per_vq_vectors, the size of msix_names is 2, but snprintf can do out of bound accessing here. (We name the msix shared by virtqueues with something like "%s-virtqueues" before the patch). Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/9] virtio: provide a method to get the IRQ affinity mask for a virtqueue
On 2017年01月27日 16:16, Christoph Hellwig wrote: This basically passed up the pci_irq_get_affinity information through virtio through an optional get_vq_affinity method. It is only implemented by the PCI backend for now, and only when we use per-virtqueue IRQs. Signed-off-by: Christoph Hellwig --- Reviewed-by: Jason Wang drivers/virtio/virtio_pci_common.c | 11 +++ drivers/virtio/virtio_pci_common.h | 2 ++ drivers/virtio/virtio_pci_legacy.c | 1 + drivers/virtio/virtio_pci_modern.c | 2 ++ include/linux/virtio_config.h | 3 +++ 5 files changed, 19 insertions(+) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index c244212..31ba259 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -361,6 +361,17 @@ int vp_set_vq_affinity(struct virtqueue *vq, int cpu) return 0; } +const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev, int index) +{ + struct virtio_pci_device *vp_dev = to_vp_device(vdev); + unsigned int *map = vp_dev->msix_vector_map; + + if (!map || map[index] == VIRTIO_MSI_NO_VECTOR) + return NULL; + + return pci_irq_get_affinity(vp_dev->pci_dev, map[index]); +} + #ifdef CONFIG_PM_SLEEP static int virtio_pci_freeze(struct device *dev) { diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h index a6ad9ec..ac8c9d7 100644 --- a/drivers/virtio/virtio_pci_common.h +++ b/drivers/virtio/virtio_pci_common.h @@ -108,6 +108,8 @@ const char *vp_bus_name(struct virtio_device *vdev); */ int vp_set_vq_affinity(struct virtqueue *vq, int cpu); +const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev, int index); + #if IS_ENABLED(CONFIG_VIRTIO_PCI_LEGACY) int virtio_pci_legacy_probe(struct virtio_pci_device *); void virtio_pci_legacy_remove(struct virtio_pci_device *); diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c index 2ab6aee..f7362c5 100644 --- a/drivers/virtio/virtio_pci_legacy.c +++ b/drivers/virtio/virtio_pci_legacy.c @@ -190,6 +190,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = { .finalize_features = vp_finalize_features, .bus_name = vp_bus_name, .set_vq_affinity = vp_set_vq_affinity, + .get_vq_affinity = vp_get_vq_affinity, }; /* the PCI probing function */ diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index a7a0981..7bc3004 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -437,6 +437,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = { .finalize_features = vp_finalize_features, .bus_name = vp_bus_name, .set_vq_affinity = vp_set_vq_affinity, + .get_vq_affinity = vp_get_vq_affinity, }; static const struct virtio_config_ops virtio_pci_config_ops = { @@ -452,6 +453,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = { .finalize_features = vp_finalize_features, .bus_name = vp_bus_name, .set_vq_affinity = vp_set_vq_affinity, + .get_vq_affinity = vp_get_vq_affinity, }; /** diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index 2ebe506..8355bab 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -58,6 +58,7 @@ struct irq_affinity; * This returns a pointer to the bus name a la pci_name from which * the caller can then copy. * @set_vq_affinity: set the affinity for a virtqueue. + * @get_vq_affinity: get the affinity for a virtqueue (optional). */ typedef void vq_callback_t(struct virtqueue *); struct virtio_config_ops { @@ -77,6 +78,8 @@ struct virtio_config_ops { int (*finalize_features)(struct virtio_device *vdev); const char *(*bus_name)(struct virtio_device *vdev); int (*set_vq_affinity)(struct virtqueue *vq, int cpu); + const struct cpumask *(*get_vq_affinity)(struct virtio_device *vdev, + int index); }; /* If driver didn't advertise the feature, it will never appear. */ -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 5/9] virtio: allow drivers to request IRQ affinity when creating VQs
On 2017年01月27日 16:16, Christoph Hellwig wrote: Add a struct irq_affinity pointer to the find_vqs methods, which if set is used to tell the PCI layer to create the MSI-X vectors for our I/O virtqueues with the proper affinity from the start. Compared to after the fact affinity hints this gives us an instantly working setup and allows to allocate the irq descritors node-local and avoid interconnect traffic. Last but not least this will allow blk-mq queues are created based on the interrupt affinity for storage drivers. Signed-off-by: Christoph Hellwig --- Reviewed-by: Jason Wang -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/9] virtio_pci: simplify MSI-X setup
On 2017年01月27日 16:16, Christoph Hellwig wrote: Try to grab the MSI-X vectors early and fall back to the shared one before doing lots of allocations. Signed-off-by: Christoph Hellwig --- Reviewed-by: Jason Wang drivers/virtio/virtio_pci_common.c | 58 +++--- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index 9c4ad7d3f..74ff74c0 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -142,32 +142,39 @@ void vp_del_vqs(struct virtio_device *vdev) } static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs, - struct virtqueue *vqs[], - vq_callback_t *callbacks[], - const char * const names[], - bool per_vq_vectors) + struct virtqueue *vqs[], vq_callback_t *callbacks[], + const char * const names[]) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); const char *name = dev_name(&vp_dev->vdev.dev); int i, err = -ENOMEM, allocated_vectors, nvectors; + bool shared = false; u16 msix_vec; - if (per_vq_vectors) { - /* Best option: one for change interrupt, one per vq. */ - nvectors = 1; - for (i = 0; i < nvqs; ++i) - if (callbacks[i]) - ++nvectors; - } else { - /* Second best: one for change, shared for all vqs. */ + nvectors = 1; + for (i = 0; i < nvqs; i++) { + if (callbacks[i]) + nvectors++; + } + + /* Try one vector per queue first. */ + err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors, + PCI_IRQ_MSIX); + if (err < 0) { + /* Fallback to one vector for config, one shared for queues. */ + shared = true; nvectors = 2; + err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors, + PCI_IRQ_MSIX); + if (err < 0) + return err; } vp_dev->msix_vectors = nvectors; vp_dev->msix_names = kmalloc_array(nvectors, sizeof(*vp_dev->msix_names), GFP_KERNEL); if (!vp_dev->msix_names) - return err; + goto out_free_irq_vectors; vp_dev->msix_affinity_masks = kcalloc(nvectors, sizeof(*vp_dev->msix_affinity_masks), GFP_KERNEL); @@ -180,18 +187,13 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs, goto out_free_msix_affinity_masks; } - err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors, - PCI_IRQ_MSIX); - if (err < 0) - goto out_free_msix_affinity_masks; - /* Set the vector used for configuration */ snprintf(vp_dev->msix_names[0], sizeof(*vp_dev->msix_names), "%s-config", name); err = request_irq(pci_irq_vector(vp_dev->pci_dev, 0), vp_config_changed, 0, vp_dev->msix_names[0], vp_dev); if (err) - goto out_free_irq_vectors; + goto out_free_msix_affinity_masks; /* Verify we had enough resources to assign the vector */ if (vp_dev->config_vector(vp_dev, 0) == VIRTIO_MSI_NO_VECTOR) { @@ -241,7 +243,11 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs, } vp_dev->msix_vector_map[i] = msix_vec; - if (per_vq_vectors) + /* +* Use a different vector for each queue if they are available, +* else share the same vector for all VQs. +*/ + if (!shared) allocated_vectors++; } @@ -254,8 +260,6 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs, vp_dev->config_vector(vp_dev, VIRTIO_MSI_NO_VECTOR); out_free_config_irq: free_irq(pci_irq_vector(vp_dev->pci_dev, 0), vp_dev); -out_free_irq_vectors: - pci_free_irq_vectors(vp_dev->pci_dev); out_free_msix_affinity_masks: for (i = 0; i < nvectors; i++) { if (vp_dev->msix_affinity_masks[i]) @@ -264,6 +268,8 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs, kfree(vp_dev->msix_affinity_masks); out_free_msix_names: kfree(vp_dev->msix_names); +out_free_irq_vectors: + pci_free_irq_vectors(vp_dev->pci_dev); return err; } @@ -308,15 +314,9 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs, { int err; - /* Try MSI-X with one vector per
Re: [PATCH 3/9] virtio_pci: don't duplicate the msix_enable flag in struct pci_dev
On 2017年01月27日 16:16, Christoph Hellwig wrote: Signed-off-by: Christoph Hellwig --- Reviewed-by: Jason Wang drivers/virtio/virtio_pci_common.c | 5 ++--- drivers/virtio/virtio_pci_common.h | 2 -- drivers/virtio/virtio_pci_legacy.c | 2 +- drivers/virtio/virtio_pci_modern.c | 2 +- include/uapi/linux/virtio_pci.h| 2 +- 5 files changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index 5880e86..9c4ad7d3f 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -125,7 +125,7 @@ void vp_del_vqs(struct virtio_device *vdev) vp_remove_vqs(vdev); - if (vp_dev->msix_enabled) { + if (vp_dev->pci_dev->msix_enabled) { for (i = 0; i < vp_dev->msix_vectors; i++) free_cpumask_var(vp_dev->msix_affinity_masks[i]); @@ -245,7 +245,6 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs, allocated_vectors++; } - vp_dev->msix_enabled = 1; return 0; out_remove_vqs: @@ -341,7 +340,7 @@ int vp_set_vq_affinity(struct virtqueue *vq, int cpu) if (!vq->callback) return -EINVAL; - if (vp_dev->msix_enabled) { + if (vp_dev->pci_dev->msix_enabled) { int vec = vp_dev->msix_vector_map[vq->index]; struct cpumask *mask = vp_dev->msix_affinity_masks[vec]; unsigned int irq = pci_irq_vector(vp_dev->pci_dev, vec); diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h index 8559386..217ca87 100644 --- a/drivers/virtio/virtio_pci_common.h +++ b/drivers/virtio/virtio_pci_common.h @@ -64,8 +64,6 @@ struct virtio_pci_device { /* the IO mapping for the PCI config space */ void __iomem *ioaddr; - /* MSI-X support */ - int msix_enabled; cpumask_var_t *msix_affinity_masks; /* Name strings for interrupts. This size should be enough, * and I'm too lazy to allocate each name separately. */ diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c index 47292da..2ab6aee 100644 --- a/drivers/virtio/virtio_pci_legacy.c +++ b/drivers/virtio/virtio_pci_legacy.c @@ -165,7 +165,7 @@ static void del_vq(struct virtqueue *vq) iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL); - if (vp_dev->msix_enabled) { + if (vp_dev->pci_dev->msix_enabled) { iowrite16(VIRTIO_MSI_NO_VECTOR, vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR); /* Flush the write out to device */ diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index 00e6fc1..e5ce310 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -412,7 +412,7 @@ static void del_vq(struct virtqueue *vq) vp_iowrite16(vq->index, &vp_dev->common->queue_select); - if (vp_dev->msix_enabled) { + if (vp_dev->pci_dev->msix_enabled) { vp_iowrite16(VIRTIO_MSI_NO_VECTOR, &vp_dev->common->queue_msix_vector); /* Flush the write out to device */ diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h index 90007a1..15b4385 100644 --- a/include/uapi/linux/virtio_pci.h +++ b/include/uapi/linux/virtio_pci.h @@ -79,7 +79,7 @@ * configuration space */ #define VIRTIO_PCI_CONFIG_OFF(msix_enabled) ((msix_enabled) ? 24 : 20) /* Deprecated: please use VIRTIO_PCI_CONFIG_OFF instead */ -#define VIRTIO_PCI_CONFIG(dev) VIRTIO_PCI_CONFIG_OFF((dev)->msix_enabled) +#define VIRTIO_PCI_CONFIG(dev) VIRTIO_PCI_CONFIG_OFF((dev)->pci_dev->msix_enabled) /* Virtio ABI version, this must match exactly */ #define VIRTIO_PCI_ABI_VERSION0 -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/9] virtio_pci: remove struct virtio_pci_vq_info
On 2017年01月27日 16:16, Christoph Hellwig wrote: We don't really need struct virtio_pci_vq_info, as most field in there are redundant: - the vq backpointer is not strictly neede to start with - the entry in the vqs list is not needed - the generic virtqueue already has list, we only need to check if it has a callback to get the same semantics - we can use a simple array to look up the MSI-X vec if needed. - That simple array now also duoble serves to replace the per_vq_vectors flag Signed-off-by: Christoph Hellwig --- drivers/virtio/virtio_pci_common.c | 117 +++-- drivers/virtio/virtio_pci_common.h | 25 +--- drivers/virtio/virtio_pci_legacy.c | 6 +- drivers/virtio/virtio_pci_modern.c | 6 +- 4 files changed, 39 insertions(+), 115 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index 186cbab..a3376731 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -62,16 +62,13 @@ static irqreturn_t vp_config_changed(int irq, void *opaque) static irqreturn_t vp_vring_interrupt(int irq, void *opaque) { struct virtio_pci_device *vp_dev = opaque; - struct virtio_pci_vq_info *info; irqreturn_t ret = IRQ_NONE; - unsigned long flags; + struct virtqueue *vq; - spin_lock_irqsave(&vp_dev->lock, flags); - list_for_each_entry(info, &vp_dev->virtqueues, node) { - if (vring_interrupt(irq, info->vq) == IRQ_HANDLED) + list_for_each_entry(vq, &vp_dev->vdev.vqs, list) { + if (vq->callback && vring_interrupt(irq, vq) == IRQ_HANDLED) The check of vq->callback seems redundant, we will check it soon in vring_interrupt(). Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/9] virtio_pci: use shared interrupts for virtqueues
On 2017年01月27日 16:16, Christoph Hellwig wrote: + snprintf(vp_dev->msix_names[i + 1], +sizeof(*vp_dev->msix_names), "%s-%s", dev_name(&vp_dev->vdev.dev), names[i]); err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec), - vring_interrupt, 0, - vp_dev->msix_names[msix_vec], - vqs[i]); + vring_interrupt, IRQF_SHARED, + vp_dev->msix_names[i + 1], vqs[i]); Do we need to check per_vq_vectors before dereferencing msix_names[i + 1] ? Thanks -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] virtio_blk: fix panic in initialization error path
On 2017年01月10日 03:44, Omar Sandoval wrote: From: Omar Sandoval If blk_mq_init_queue() returns an error, it gets assigned to vblk->disk->queue. Then, when we call put_disk(), we end up calling blk_put_queue() with the ERR_PTR, causing a bad dereference. Fix it by only assigning to vblk->disk->queue on success. Signed-off-by: Omar Sandoval --- drivers/block/virtio_blk.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 5545a679abd8..8587361e5356 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -628,11 +628,12 @@ static int virtblk_probe(struct virtio_device *vdev) if (err) goto out_put_disk; - q = vblk->disk->queue = blk_mq_init_queue(&vblk->tag_set); + q = blk_mq_init_queue(&vblk->tag_set); if (IS_ERR(q)) { err = -ENOMEM; goto out_free_tags; } + vblk->disk->queue = q; q->queuedata = vblk; Acked-by: Jason Wang -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] virtio_blk: avoid DMA to stack for the sense buffer
On 2017年01月04日 13:25, Christoph Hellwig wrote: Most users of BLOCK_PC requests allocate the sense buffer on the stack, so to avoid DMA to the stack copy them to a field in the heap allocated virtblk_req structure. Without that any attempt at SCSI passthrough I/O, including the SG_IO ioctl from userspace will crash the kernel. Note that this includes running tools like hdparm even when the host does not have SCSI passthrough enabled. Signed-off-by: Christoph Hellwig --- drivers/block/virtio_blk.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 5545a67..3c3b8f6 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -56,6 +56,7 @@ struct virtblk_req { struct virtio_blk_outhdr out_hdr; struct virtio_scsi_inhdr in_hdr; u8 status; + u8 sense[SCSI_SENSE_BUFFERSIZE]; struct scatterlist sg[]; }; @@ -102,7 +103,8 @@ static int __virtblk_add_req(struct virtqueue *vq, } if (type == cpu_to_virtio32(vq->vdev, VIRTIO_BLK_T_SCSI_CMD)) { - sg_init_one(&sense, vbr->req->sense, SCSI_SENSE_BUFFERSIZE); + memcpy(vbr->sense, vbr->req->sense, SCSI_SENSE_BUFFERSIZE); + sg_init_one(&sense, vbr->sense, SCSI_SENSE_BUFFERSIZE); sgs[num_out + num_in++] = &sense; sg_init_one(&inhdr, &vbr->in_hdr, sizeof(vbr->in_hdr)); sgs[num_out + num_in++] = &inhdr; Acked-by: Jason Wang -- To unsubscribe from this list: send the line "unsubscribe linux-block" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html