RE: [PATCH 2/2] iommu/ipmmu-vmsa: Add support for r8a779a0
Hi Geert-san,

> From: Geert Uytterhoeven, Sent: Tuesday, September 7, 2021 12:34 AM
>
> Hi Shimoda-san,
>
> On Wed, Sep 1, 2021 at 12:27 PM Yoshihiro Shimoda wrote:
> > Add support for r8a779a0 (R-Car V3U). The IPMMU hardware design
> > of this SoC differs from the others. So, add a new ipmmu_features for it.
> >
> > Signed-off-by: Yoshihiro Shimoda
>
> > --- a/drivers/iommu/ipmmu-vmsa.c
> > +++ b/drivers/iommu/ipmmu-vmsa.c
> > @@ -922,6 +922,20 @@ static const struct ipmmu_features ipmmu_features_rcar_gen3 = {
> >  	.utlb_offset_base = 0,
> >  };
> >
> > +static const struct ipmmu_features ipmmu_features_r8a779a0 = {
> > +	.use_ns_alias_offset = false,
> > +	.has_cache_leaf_nodes = true,
> > +	.number_of_contexts = 8,
>
> Shouldn't this be 16?
> Or do you plan to add support for more than 8 contexts later, as that
> would require increasing IPMMU_CTX_MAX, and updating ipmmu_ctx_reg()
> to handle the second bank of 8 contexts?

I would like to add support for more than 8 contexts later, because I
realized that ctx_offset_{base,stride} are not suitable for the second
bank of 8 contexts...

> Regardless, I assume this will still work when limiting to 8
> contexts, so
> Reviewed-by: Geert Uytterhoeven

Thank you for your review!

Best regards,
Yoshihiro Shimoda

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing
On Mon, Sep 6, 2021 at 7:44 PM Robin Murphy wrote:
>
> On 2021-08-05 17:03, Lorenzo Pieralisi wrote:
> > On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote:
> >
> > [...]
> >
> >> +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
> >> +{
> >
> > [full iort_node_get_rmr_info() hunk and prot-assignment code quoted
> > in Robin's message below ...]
>
> > On the prot value assignment based on the remapping flag, I'd like to
> > hear Robin/Joerg's opinion, I'd avoid being in a situation where
> > "normally" this would work but then we have to quirk it.
> >
> > Is this a valid assumption _always_ ?
>
> No. Certainly applying IOMMU_CACHE without reference to the device's
> _CCA attribute or how CPUs may be accessing a shared buffer could lead
> to a loss of coherency. At worst, applying IOMMU_MMIO to a
> device-private buffer *could* cause the device to lose coherency with
> itself if the memory underlying the RMR may have allocated into system
> caches. Note that the expected use for non-remappable RMRs is the device
> holding some sort of long-lived private data in system RAM - the MSI
> doorbell trick is far more of a niche hack really.
>
> At the very least I think we need to refer to the device's memory access
> properties here.
>
> Jon, Laurentiu - how do RMRs correspond to the EFI memory map on your
> firmware? I'm starting to think that as long as the underlying memory is
> described appropriately there then we should be able to infer correct
> attributes from the EFI memory type and flags.

The devices are all cache coherent and marked as _CCA, 1. The memory
regions are in the virt table as ARM_MEMORY_REGION_ATTRIBUTE_DEVICE.

The current chicken-and-egg problem we have is that during fsl-mc-bus
initialization we call

	error = acpi_dma_configure_id(>dev, DEV_DMA_COHERENT, _stream_id);

which gets deferred because the SMMU has not been initialized yet. Then
we initialize the RMR tables, but there is no device reference there to
be able to
Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing
On 2021-08-05 17:03, Lorenzo Pieralisi wrote:
> On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote:
>
> [...]
>
>> +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
>> +{
>> +	struct acpi_iort_node *smmu;
>> +	struct acpi_iort_rmr *rmr;
>> +	struct acpi_iort_rmr_desc *rmr_desc;
>> +	u32 map_count = iort_node->mapping_count;
>> +	u32 sid;
>> +	int i;
>> +
>> +	if (!iort_node->mapping_offset || map_count != 1) {
>> +		pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
>> +		       iort_node);
>> +		return;
>> +	}
>> +
>> +	/* Retrieve associated smmu and stream id */
>> +	smmu = iort_node_get_id(iort_node, , 0);
>> +	if (!smmu) {
>> +		pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n",
>> +		       iort_node);
>> +		return;
>> +	}
>> +
>> +	/* Retrieve RMR data */
>> +	rmr = (struct acpi_iort_rmr *)iort_node->node_data;
>> +	if (!rmr->rmr_offset || !rmr->rmr_count) {
>> +		pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node %p\n",
>> +		       iort_node);
>> +		return;
>> +	}
>> +
>> +	rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
>> +				rmr->rmr_offset);
>> +
>> +	iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
>> +
>> +	for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
>> +		struct iommu_resv_region *region;
>> +		enum iommu_resv_type type;
>> +		int prot = IOMMU_READ | IOMMU_WRITE;
>> +		u64 addr = rmr_desc->base_address, size = rmr_desc->length;
>> +
>> +		if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
>> +			/* PAGE align base addr and size */
>> +			addr &= PAGE_MASK;
>> +			size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
>> +
>> +			pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned to 64K, continue with [0x%llx - 0x%llx]\n",
>> +			       rmr_desc->base_address,
>> +			       rmr_desc->base_address + rmr_desc->length - 1,
>> +			       addr, addr + size - 1);
>> +		}
>> +		if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
>> +			type = IOMMU_RESV_DIRECT_RELAXABLE;
>> +			/*
>> +			 * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
>> +			 * normally used for allocated system memory that is
>> +			 * then used for device specific reserved regions.
>> +			 */
>> +			prot |= IOMMU_CACHE;
>> +		} else {
>> +			type = IOMMU_RESV_DIRECT;
>> +			/*
>> +			 * Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally used
>> +			 * for device memory like MSI doorbell.
>> +			 */
>> +			prot |= IOMMU_MMIO;
>> +		}
>
> On the prot value assignment based on the remapping flag, I'd like to
> hear Robin/Joerg's opinion, I'd avoid being in a situation where
> "normally" this would work but then we have to quirk it.
>
> Is this a valid assumption _always_ ?

No. Certainly applying IOMMU_CACHE without reference to the device's
_CCA attribute or how CPUs may be accessing a shared buffer could lead
to a loss of coherency. At worst, applying IOMMU_MMIO to a
device-private buffer *could* cause the device to lose coherency with
itself if the memory underlying the RMR may have allocated into system
caches. Note that the expected use for non-remappable RMRs is the device
holding some sort of long-lived private data in system RAM - the MSI
doorbell trick is far more of a niche hack really.

At the very least I think we need to refer to the device's memory access
properties here.

Jon, Laurentiu - how do RMRs correspond to the EFI memory map on your
firmware? I'm starting to think that as long as the underlying memory is
described appropriately there then we should be able to infer correct
attributes from the EFI memory type and flags.

Robin.
Re: [PATCH 2/2] iommu/ipmmu-vmsa: Add support for r8a779a0
Hi Shimoda-san,

On Wed, Sep 1, 2021 at 12:27 PM Yoshihiro Shimoda wrote:
> Add support for r8a779a0 (R-Car V3U). The IPMMU hardware design
> of this SoC differs from the others. So, add a new ipmmu_features for it.
>
> Signed-off-by: Yoshihiro Shimoda
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -922,6 +922,20 @@ static const struct ipmmu_features ipmmu_features_rcar_gen3 = {
>  	.utlb_offset_base = 0,
>  };
>
> +static const struct ipmmu_features ipmmu_features_r8a779a0 = {
> +	.use_ns_alias_offset = false,
> +	.has_cache_leaf_nodes = true,
> +	.number_of_contexts = 8,

Shouldn't this be 16?
Or do you plan to add support for more than 8 contexts later, as that
would require increasing IPMMU_CTX_MAX, and updating ipmmu_ctx_reg()
to handle the second bank of 8 contexts?

Regardless, I assume this will still work when limiting to 8
contexts, so
Reviewed-by: Geert Uytterhoeven

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker.
But when I'm talking to journalists I just say "programmer" or something
like that.
						-- Linus Torvalds
Re: [PATCH 1/2] dt-bindings: iommu: renesas, ipmmu-vmsa: add r8a779a0 support
On Wed, Sep 1, 2021 at 12:27 PM Yoshihiro Shimoda wrote:
> Add support for r8a779a0 (R-Car V3U).
>
> Signed-off-by: Yoshihiro Shimoda

Reviewed-by: Geert Uytterhoeven

Gr{oetje,eeting}s,

						Geert
Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
On Mon, Sep 6, 2021 at 6:43 PM Michael S. Tsirkin wrote: > > On Mon, Sep 06, 2021 at 04:45:55PM +0800, Yongji Xie wrote: > > On Mon, Sep 6, 2021 at 4:01 PM Michael S. Tsirkin wrote: > > > > > > On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote: > > > > On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin > > > > wrote: > > > > > > > > > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote: > > > > > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin > > > > > > wrote: > > > > > > > > > > > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote: > > > > > > > > This adds a new callback to support device specific reset > > > > > > > > behavior. The vdpa bus driver will call the reset function > > > > > > > > instead of setting status to zero during resetting. > > > > > > > > > > > > > > > > Signed-off-by: Xie Yongji > > > > > > > > > > > > > > > > > > > > > This does gloss over a significant change though: > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > @@ -348,12 +352,12 @@ static inline struct device > > > > > > > > *vdpa_get_dma_dev(struct vdpa_device *vdev) > > > > > > > > return vdev->dma_dev; > > > > > > > > } > > > > > > > > > > > > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev) > > > > > > > > +static inline int vdpa_reset(struct vdpa_device *vdev) > > > > > > > > { > > > > > > > > const struct vdpa_config_ops *ops = vdev->config; > > > > > > > > > > > > > > > > vdev->features_valid = false; > > > > > > > > - ops->set_status(vdev, 0); > > > > > > > > + return ops->reset(vdev); > > > > > > > > } > > > > > > > > > > > > > > > > static inline int vdpa_set_features(struct vdpa_device *vdev, > > > > > > > > u64 features) > > > > > > > > > > > > > > > > > > > > > Unfortunately this breaks virtio_vdpa: > > > > > > > > > > > > > > > > > > > > > static void virtio_vdpa_reset(struct virtio_device *vdev) > > > > > > > { > > > > > > > struct vdpa_device *vdpa = vd_get_vdpa(vdev); > > > > > > > > > > > > > > 
vdpa_reset(vdpa); > > > > > > > } > > > > > > > > > > > > > > > > > > > > > and there's no easy way to fix this, kernel can't recover > > > > > > > from a reset failure e.g. during driver unbind. > > > > > > > > > > > > > > > > > > > Yes, but it should be safe with the protection of software IOTLB > > > > > > even > > > > > > if the reset() fails during driver unbind. > > > > > > > > > > > > Thanks, > > > > > > Yongji > > > > > > > > > > Hmm. I don't see it. > > > > > What exactly will happen? What prevents device from poking at > > > > > memory after reset? Note that dma unmap in e.g. del_vqs happens > > > > > too late. > > > > > > > > But I didn't see any problems with touching the memory for virtqueues. > > > > > > Drivers make the assumption that after reset returns no new > > > buffers will be consumed. For example a bunch of drivers > > > call virtqueue_detach_unused_buf. > > > > I'm not sure if I get your point. But it looks like > > virtqueue_detach_unused_buf() will check the driver's metadata first > > rather than read the memory from virtqueue. > > > > > I can't say whether block makes this assumption anywhere. > > > Needs careful auditing. > > > > > > > The memory should not be freed after dma unmap? > > > > > > But unmap does not happen until after the reset. > > > > > > > I mean the memory is totally allocated and controlled by the VDUSE > > driver. The VDUSE driver will not return them to the buddy system > > unless userspace unmap it. > > Right. But what stops VDUSE from poking at memory after > reset failed? > Only itself. But in normal cases, userspace would not poke at the memory since there is no available data after reset. And it makes me think whether it's better to disallow returning errors from userspace for the reset message. Then the only case that leads to reset failure is the request timeout, which will mark the device broken now. 
> > > > > > > > And the memory for the bounce buffer should also be safe to be > > > > accessed by userspace in this case. > > > > > > > > > And what about e.g. interrupts? > > > > > E.g. we have this: > > > > > > > > > > /* Virtqueues are stopped, nothing can use vblk->vdev > > > > > anymore. */ > > > > > vblk->vdev = NULL; > > > > > > > > > > and this is no longer true at this point. > > > > > > > > > > > > > You're right. But I didn't see where the interrupt handler will use > > > > the vblk->vdev. > > > > > > static void virtblk_done(struct virtqueue *vq) > > > { > > > struct virtio_blk *vblk = vq->vdev->priv; > > > > > > vq->vdev is the same as vblk->vdev. > > > > > > > We will test the vq->ready (will be set to false in del_vqs()) before > > injecting an interrupt in the VDUSE driver. So it should be OK? > > Maybe not ... It's not designed for such asynchronous access, so e.g. > there's no locking or memory ordering around accesses. > Yes, so we still need the below fix. > > > > > > > > So it seems to be
Re: [git pull] IOMMU Updates for Linux v5.15
On 2021-09-03 22:44, Joerg Roedel wrote:
> On Fri, Sep 03, 2021 at 11:43:31AM -0700, Linus Torvalds wrote:
>> 	choice
>> 		prompt "IOMMU default domain type"
>> 		depends on IOMMU_API
>> 		default IOMMU_DEFAULT_DMA_LAZY if AMD_IOMMU || INTEL_IOMMU
>> 		default IOMMU_DEFAULT_DMA_STRICT
>
> Huh, yeah, that is bogus. Seems like I overlooked that part of the
> patch-set because I was so amazed by the simplifications and cleanups
> in the rest of it.

Mad as it looks, this does in fact capture the existing behaviour. What
we've consolidated here was previously a weird mix of driver- and
subsystem-level controls, and it is specifically those two drivers which
have a long-standing expectation of using lazy behaviour by default.

>> See what I'm saying? Making the default be based on some random "this
>> driver is enabled" when it can then affect *other* drivers that are
>> also enabled and not part of the decision seems to be a fundamental
>> confusion about what is going on, when it's not at all clear which
>> driver will actually be IN USE.
>
> The Kconfig option itself was actually my suggestion, but how the
> default value is chosen certainly needs improvement. I will sort this
> out with the people involved.

>> IOW, the fix might be to just say "the default is always lazy". Or the
>> fix might be something that is global to a configuration and doesn't
>> rely on which iommu is in use (eg "on x86, the default is always
>> LAZY")

We could certainly express it as "default IOMMU_DEFAULT_DMA_LAZY if
X86" if people would prefer - virtio-iommu doesn't support lazy mode
either way, so the end result will still be equivalent.

Robin.

>> Or the fix is to make that 'iommu_dma_strict' variable - and the
>> default value for it - be a per-IOMMU thing rather than be a global.
>
> My preference would be to make 'lazy' or 'strict' the default for all,
> but the ARM folks might disagree. On the other side it also doesn't
> make sense to let IOMMU drivers override the user's Kconfig choice at
> runtime. We will discuss this and come up with something better.
> Thanks,
>
> 	Joerg
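Robin's "default ... if X86" alternative, expressed as a concrete fragment. This is only a sketch of how the quoted choice block could read with that change applied; the individual IOMMU_DEFAULT_* entries are elided, and their exact wording follows the snippet quoted in the thread, not a verified tree:

```kconfig
choice
	prompt "IOMMU default domain type"
	depends on IOMMU_API
	default IOMMU_DEFAULT_DMA_LAZY if X86
	default IOMMU_DEFAULT_DMA_STRICT
	# ... individual IOMMU_DEFAULT_* config entries unchanged ...
endchoice
```

With a platform-level condition like this, the chosen default no longer depends on which IOMMU drivers happen to be enabled, which was Linus's objection above.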
Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
On Mon, Sep 06, 2021 at 04:45:55PM +0800, Yongji Xie wrote: > On Mon, Sep 6, 2021 at 4:01 PM Michael S. Tsirkin wrote: > > > > On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote: > > > On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin wrote: > > > > > > > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote: > > > > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin > > > > > wrote: > > > > > > > > > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote: > > > > > > > This adds a new callback to support device specific reset > > > > > > > behavior. The vdpa bus driver will call the reset function > > > > > > > instead of setting status to zero during resetting. > > > > > > > > > > > > > > Signed-off-by: Xie Yongji > > > > > > > > > > > > > > > > > > This does gloss over a significant change though: > > > > > > > > > > > > > > > > > > > --- > > > > > > > @@ -348,12 +352,12 @@ static inline struct device > > > > > > > *vdpa_get_dma_dev(struct vdpa_device *vdev) > > > > > > > return vdev->dma_dev; > > > > > > > } > > > > > > > > > > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev) > > > > > > > +static inline int vdpa_reset(struct vdpa_device *vdev) > > > > > > > { > > > > > > > const struct vdpa_config_ops *ops = vdev->config; > > > > > > > > > > > > > > vdev->features_valid = false; > > > > > > > - ops->set_status(vdev, 0); > > > > > > > + return ops->reset(vdev); > > > > > > > } > > > > > > > > > > > > > > static inline int vdpa_set_features(struct vdpa_device *vdev, > > > > > > > u64 features) > > > > > > > > > > > > > > > > > > Unfortunately this breaks virtio_vdpa: > > > > > > > > > > > > > > > > > > static void virtio_vdpa_reset(struct virtio_device *vdev) > > > > > > { > > > > > > struct vdpa_device *vdpa = vd_get_vdpa(vdev); > > > > > > > > > > > > vdpa_reset(vdpa); > > > > > > } > > > > > > > > > > > > > > > > > > and there's no easy way to fix this, kernel can't recover > > > > > > from a reset failure e.g. 
during driver unbind. > > > > > > > > > > > > > > > > Yes, but it should be safe with the protection of software IOTLB even > > > > > if the reset() fails during driver unbind. > > > > > > > > > > Thanks, > > > > > Yongji > > > > > > > > Hmm. I don't see it. > > > > What exactly will happen? What prevents device from poking at > > > > memory after reset? Note that dma unmap in e.g. del_vqs happens > > > > too late. > > > > > > But I didn't see any problems with touching the memory for virtqueues. > > > > Drivers make the assumption that after reset returns no new > > buffers will be consumed. For example a bunch of drivers > > call virtqueue_detach_unused_buf. > > I'm not sure if I get your point. But it looks like > virtqueue_detach_unused_buf() will check the driver's metadata first > rather than read the memory from virtqueue. > > > I can't say whether block makes this assumption anywhere. > > Needs careful auditing. > > > > > The memory should not be freed after dma unmap? > > > > But unmap does not happen until after the reset. > > > > I mean the memory is totally allocated and controlled by the VDUSE > driver. The VDUSE driver will not return them to the buddy system > unless userspace unmap it. Right. But what stops VDUSE from poking at memory after reset failed? > > > > > And the memory for the bounce buffer should also be safe to be > > > accessed by userspace in this case. > > > > > > > And what about e.g. interrupts? > > > > E.g. we have this: > > > > > > > > /* Virtqueues are stopped, nothing can use vblk->vdev anymore. > > > > */ > > > > vblk->vdev = NULL; > > > > > > > > and this is no longer true at this point. > > > > > > > > > > You're right. But I didn't see where the interrupt handler will use > > > the vblk->vdev. > > > > static void virtblk_done(struct virtqueue *vq) > > { > > struct virtio_blk *vblk = vq->vdev->priv; > > > > vq->vdev is the same as vblk->vdev. 
> > > > We will test the vq->ready (will be set to false in del_vqs()) before > injecting an interrupt in the VDUSE driver. So it should be OK? Maybe not ... It's not designed for such asynchronous access, so e.g. there's no locking or memory ordering around accesses. > > > > > So it seems to be not too late to fix it: > > > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c > > > b/drivers/vdpa/vdpa_user/vduse_dev.c > > > index 5c25ff6483ad..ea41a7389a26 100644 > > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c > > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c > > > @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct > > > vdpa_device *vdpa, unsigned int offset, > > > static int vduse_vdpa_reset(struct vdpa_device *vdpa) > > > { > > > struct vduse_dev *dev = vdpa_to_vduse(vdpa); > > > + int ret; > > > > > > - if (vduse_dev_set_status(dev, 0)) > > > - return -EIO; > > > + ret = vduse_dev_set_status(dev, 0); > > > > > >
Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
On Mon, Sep 6, 2021 at 4:01 PM Michael S. Tsirkin wrote: > > On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote: > > On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin wrote: > > > > > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote: > > > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin > > > > wrote: > > > > > > > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote: > > > > > > This adds a new callback to support device specific reset > > > > > > behavior. The vdpa bus driver will call the reset function > > > > > > instead of setting status to zero during resetting. > > > > > > > > > > > > Signed-off-by: Xie Yongji > > > > > > > > > > > > > > > This does gloss over a significant change though: > > > > > > > > > > > > > > > > --- > > > > > > @@ -348,12 +352,12 @@ static inline struct device > > > > > > *vdpa_get_dma_dev(struct vdpa_device *vdev) > > > > > > return vdev->dma_dev; > > > > > > } > > > > > > > > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev) > > > > > > +static inline int vdpa_reset(struct vdpa_device *vdev) > > > > > > { > > > > > > const struct vdpa_config_ops *ops = vdev->config; > > > > > > > > > > > > vdev->features_valid = false; > > > > > > - ops->set_status(vdev, 0); > > > > > > + return ops->reset(vdev); > > > > > > } > > > > > > > > > > > > static inline int vdpa_set_features(struct vdpa_device *vdev, u64 > > > > > > features) > > > > > > > > > > > > > > > Unfortunately this breaks virtio_vdpa: > > > > > > > > > > > > > > > static void virtio_vdpa_reset(struct virtio_device *vdev) > > > > > { > > > > > struct vdpa_device *vdpa = vd_get_vdpa(vdev); > > > > > > > > > > vdpa_reset(vdpa); > > > > > } > > > > > > > > > > > > > > > and there's no easy way to fix this, kernel can't recover > > > > > from a reset failure e.g. during driver unbind. > > > > > > > > > > > > > Yes, but it should be safe with the protection of software IOTLB even > > > > if the reset() fails during driver unbind. 
> > > > > > > > Thanks, > > > > Yongji > > > > > > Hmm. I don't see it. > > > What exactly will happen? What prevents device from poking at > > > memory after reset? Note that dma unmap in e.g. del_vqs happens > > > too late. > > > > But I didn't see any problems with touching the memory for virtqueues. > > Drivers make the assumption that after reset returns no new > buffers will be consumed. For example a bunch of drivers > call virtqueue_detach_unused_buf. I'm not sure if I get your point. But it looks like virtqueue_detach_unused_buf() will check the driver's metadata first rather than read the memory from virtqueue. > I can't say whether block makes this assumption anywhere. > Needs careful auditing. > > > The memory should not be freed after dma unmap? > > But unmap does not happen until after the reset. > I mean the memory is totally allocated and controlled by the VDUSE driver. The VDUSE driver will not return them to the buddy system unless userspace unmap it. > > > And the memory for the bounce buffer should also be safe to be > > accessed by userspace in this case. > > > > > And what about e.g. interrupts? > > > E.g. we have this: > > > > > > /* Virtqueues are stopped, nothing can use vblk->vdev anymore. */ > > > vblk->vdev = NULL; > > > > > > and this is no longer true at this point. > > > > > > > You're right. But I didn't see where the interrupt handler will use > > the vblk->vdev. > > static void virtblk_done(struct virtqueue *vq) > { > struct virtio_blk *vblk = vq->vdev->priv; > > vq->vdev is the same as vblk->vdev. > We will test the vq->ready (will be set to false in del_vqs()) before injecting an interrupt in the VDUSE driver. So it should be OK? 
> > > So it seems to be not too late to fix it: > > > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c > > b/drivers/vdpa/vdpa_user/vduse_dev.c > > index 5c25ff6483ad..ea41a7389a26 100644 > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c > > @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct > > vdpa_device *vdpa, unsigned int offset, > > static int vduse_vdpa_reset(struct vdpa_device *vdpa) > > { > > struct vduse_dev *dev = vdpa_to_vduse(vdpa); > > + int ret; > > > > - if (vduse_dev_set_status(dev, 0)) > > - return -EIO; > > + ret = vduse_dev_set_status(dev, 0); > > > > vduse_dev_reset(dev); > > > > - return 0; > > + return ret; > > } > > > > static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa) > > > > Thanks, > > Yongji > > Needs some comments to explain why it's done like this. > This is used to make sure the userspace can't not inject the interrupt any more after reset. The vduse_dev_reset() will clear the interrupt callback and flush the irq kworker. > BTW device is generally wedged at this point right? > E.g. if reset during initialization fails, userspace > will still get
Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote:
> On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin wrote:
> >
> > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin wrote:
> > > >
> > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > > > This adds a new callback to support device specific reset
> > > > > behavior. The vdpa bus driver will call the reset function
> > > > > instead of setting status to zero during resetting.
> > > > >
> > > > > Signed-off-by: Xie Yongji
> > > >
> > > > This does gloss over a significant change though:
> > > >
> > > > > ---
> > > > > @@ -348,12 +352,12 @@ static inline struct device *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > > > >  	return vdev->dma_dev;
> > > > >  }
> > > > >
> > > > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > > > >  {
> > > > >  	const struct vdpa_config_ops *ops = vdev->config;
> > > > >
> > > > >  	vdev->features_valid = false;
> > > > > -	ops->set_status(vdev, 0);
> > > > > +	return ops->reset(vdev);
> > > > >  }
> > > > >
> > > > >  static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)
> > > >
> > > > Unfortunately this breaks virtio_vdpa:
> > > >
> > > > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > > > {
> > > > 	struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> > > >
> > > > 	vdpa_reset(vdpa);
> > > > }
> > > >
> > > > and there's no easy way to fix this, kernel can't recover
> > > > from a reset failure e.g. during driver unbind.
> > >
> > > Yes, but it should be safe with the protection of software IOTLB even
> > > if the reset() fails during driver unbind.
> > >
> > > Thanks,
> > > Yongji
> >
> > Hmm. I don't see it.
> > What exactly will happen? What prevents device from poking at
> > memory after reset? Note that dma unmap in e.g. del_vqs happens
> > too late.
>
> But I didn't see any problems with touching the memory for virtqueues.

Drivers make the assumption that after reset returns no new
buffers will be consumed. For example a bunch of drivers
call virtqueue_detach_unused_buf.
I can't say whether block makes this assumption anywhere.
Needs careful auditing.

> The memory should not be freed after dma unmap?

But unmap does not happen until after the reset.

> And the memory for the bounce buffer should also be safe to be
> accessed by userspace in this case.
>
> > And what about e.g. interrupts?
> > E.g. we have this:
> >
> > 	/* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
> > 	vblk->vdev = NULL;
> >
> > and this is no longer true at this point.
>
> You're right. But I didn't see where the interrupt handler will use
> the vblk->vdev.

static void virtblk_done(struct virtqueue *vq)
{
	struct virtio_blk *vblk = vq->vdev->priv;

vq->vdev is the same as vblk->vdev.

> So it seems to be not too late to fix it:
>
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c
> index 5c25ff6483ad..ea41a7389a26 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct vdpa_device *vdpa, unsigned int offset,
>  static int vduse_vdpa_reset(struct vdpa_device *vdpa)
>  {
>  	struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> +	int ret;
>
> -	if (vduse_dev_set_status(dev, 0))
> -		return -EIO;
> +	ret = vduse_dev_set_status(dev, 0);
>
>  	vduse_dev_reset(dev);
>
> -	return 0;
> +	return ret;
>  }
>
>  static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)
>
> Thanks,
> Yongji

Needs some comments to explain why it's done like this.
BTW device is generally wedged at this point right?
E.g. if reset during initialization fails, userspace
will still get the reset at some later point and be confused ...
-- MST ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
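The interrupt-after-teardown hazard Michael points at can be sketched with a minimal user-space C model. All struct and function names below are simplified stand-ins for the virtio_blk ones, not the real kernel code; the point is only that the completion callback reaches driver state through vq->vdev->priv, so it can observe the cleared back-pointer if it fires after remove.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures involved. */
struct virtio_device { void *priv; };
struct virtqueue     { struct virtio_device *vdev; };
struct virtio_blk    { struct virtio_device *vdev; int requests_done; };

/*
 * Models virtblk_done(): the completion callback recovers the driver
 * state via vq->vdev->priv, i.e. vq->vdev is the same device object
 * that vblk->vdev points back to.  Returns 0 on a normal completion,
 * -1 if it observes the cleared pointer (in the real driver this is a
 * callback running against a torn-down device).
 */
int virtblk_done_model(struct virtqueue *vq)
{
	struct virtio_blk *vblk = vq->vdev->priv;

	if (!vblk->vdev)
		return -1;	/* interrupt fired after "vblk->vdev = NULL" */
	vblk->requests_done++;
	return 0;
}
```

If reset() fails during unbind, nothing guarantees the device has stopped injecting interrupts, so the callback can still run in the window after the driver sets vblk->vdev = NULL, which is exactly the comment ("nothing can use vblk->vdev anymore") no longer holding.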
Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin wrote: > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote: > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin wrote: > > > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote: > > > > This adds a new callback to support device specific reset > > > > behavior. The vdpa bus driver will call the reset function > > > > instead of setting status to zero during resetting. > > > > > > > > Signed-off-by: Xie Yongji > > > > > > > > > This does gloss over a significant change though: > > > > > > > > > > --- > > > > @@ -348,12 +352,12 @@ static inline struct device > > > > *vdpa_get_dma_dev(struct vdpa_device *vdev) > > > > return vdev->dma_dev; > > > > } > > > > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev) > > > > +static inline int vdpa_reset(struct vdpa_device *vdev) > > > > { > > > > const struct vdpa_config_ops *ops = vdev->config; > > > > > > > > vdev->features_valid = false; > > > > - ops->set_status(vdev, 0); > > > > + return ops->reset(vdev); > > > > } > > > > > > > > static inline int vdpa_set_features(struct vdpa_device *vdev, u64 > > > > features) > > > > > > > > > Unfortunately this breaks virtio_vdpa: > > > > > > > > > static void virtio_vdpa_reset(struct virtio_device *vdev) > > > { > > > struct vdpa_device *vdpa = vd_get_vdpa(vdev); > > > > > > vdpa_reset(vdpa); > > > } > > > > > > > > > and there's no easy way to fix this, kernel can't recover > > > from a reset failure e.g. during driver unbind. > > > > > > > Yes, but it should be safe with the protection of software IOTLB even > > if the reset() fails during driver unbind. > > > > Thanks, > > Yongji > > Hmm. I don't see it. > What exactly will happen? What prevents device from poking at > memory after reset? Note that dma unmap in e.g. del_vqs happens > too late. But I didn't see any problems with touching the memory for virtqueues. The memory should not be freed after dma unmap? 
And the memory for the bounce buffer should also be safe to be accessed by userspace in this case. > And what about e.g. interrupts? > E.g. we have this: > > /* Virtqueues are stopped, nothing can use vblk->vdev anymore. */ > vblk->vdev = NULL; > > and this is no longer true at this point. > You're right. But I didn't see where the interrupt handler will use the vblk->vdev. So it seems to be not too late to fix it: diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c index 5c25ff6483ad..ea41a7389a26 100644 --- a/drivers/vdpa/vdpa_user/vduse_dev.c +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct vdpa_device *vdpa, unsigned int offset, static int vduse_vdpa_reset(struct vdpa_device *vdpa) { struct vduse_dev *dev = vdpa_to_vduse(vdpa); + int ret; - if (vduse_dev_set_status(dev, 0)) - return -EIO; + ret = vduse_dev_set_status(dev, 0); vduse_dev_reset(dev); - return 0; + return ret; } static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa) Thanks, Yongji
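The ordering Yongji proposes above, run the device-local reset unconditionally but still report the set_status failure to the caller, can be modeled in a few lines of plain C. All names here are illustrative stand-ins for the VDUSE functions, not the real code:

```c
#include <assert.h>

/* Hypothetical stand-ins for the VDUSE internals. */
static int fake_status_result;	/* what vduse_dev_set_status() would return */
static int dev_reset_ran;	/* records that vduse_dev_reset() ran */

static int set_status_model(int status)
{
	(void)status;
	return fake_status_result;	/* may fail, e.g. userspace is gone */
}

static void dev_reset_model(void)
{
	dev_reset_ran = 1;		/* local cleanup of device state */
}

int vduse_vdpa_reset_model(void)
{
	int ret = set_status_model(0);

	dev_reset_model();	/* local reset happens even on failure */
	return ret;		/* ...but the caller still sees the error */
}
```

Compared with the pre-patch version, which returned -EIO before vduse_dev_reset() ever ran, this keeps the device-side state consistent while preserving the error for the vdpa core.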
Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote: > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin wrote: > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote: > > > This adds a new callback to support device specific reset > > > behavior. The vdpa bus driver will call the reset function > > > instead of setting status to zero during resetting. > > > > > > Signed-off-by: Xie Yongji > > > > > > This does gloss over a significant change though: > > > > > > > --- > > > @@ -348,12 +352,12 @@ static inline struct device > > > *vdpa_get_dma_dev(struct vdpa_device *vdev) > > > return vdev->dma_dev; > > > } > > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev) > > > +static inline int vdpa_reset(struct vdpa_device *vdev) > > > { > > > const struct vdpa_config_ops *ops = vdev->config; > > > > > > vdev->features_valid = false; > > > - ops->set_status(vdev, 0); > > > + return ops->reset(vdev); > > > } > > > > > > static inline int vdpa_set_features(struct vdpa_device *vdev, u64 > > > features) > > > > > > Unfortunately this breaks virtio_vdpa: > > > > > > static void virtio_vdpa_reset(struct virtio_device *vdev) > > { > > struct vdpa_device *vdpa = vd_get_vdpa(vdev); > > > > vdpa_reset(vdpa); > > } > > > > > > and there's no easy way to fix this, kernel can't recover > > from a reset failure e.g. during driver unbind. > > > > Yes, but it should be safe with the protection of software IOTLB even > if the reset() fails during driver unbind. > > Thanks, > Yongji Hmm. I don't see it. What exactly will happen? What prevents device from poking at memory after reset? Note that dma unmap in e.g. del_vqs happens too late. And what about e.g. interrupts? E.g. we have this: /* Virtqueues are stopped, nothing can use vblk->vdev anymore. */ vblk->vdev = NULL; and this is no longer true at this point. -- MST
Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops
On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin wrote: > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote: > > This adds a new callback to support device specific reset > > behavior. The vdpa bus driver will call the reset function > > instead of setting status to zero during resetting. > > > > Signed-off-by: Xie Yongji > > > This does gloss over a significant change though: > > > > --- > > @@ -348,12 +352,12 @@ static inline struct device *vdpa_get_dma_dev(struct > > vdpa_device *vdev) > > return vdev->dma_dev; > > } > > > > -static inline void vdpa_reset(struct vdpa_device *vdev) > > +static inline int vdpa_reset(struct vdpa_device *vdev) > > { > > const struct vdpa_config_ops *ops = vdev->config; > > > > vdev->features_valid = false; > > - ops->set_status(vdev, 0); > > + return ops->reset(vdev); > > } > > > > static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features) > > > Unfortunately this breaks virtio_vdpa: > > > static void virtio_vdpa_reset(struct virtio_device *vdev) > { > struct vdpa_device *vdpa = vd_get_vdpa(vdev); > > vdpa_reset(vdpa); > } > > > and there's no easy way to fix this, kernel can't recover > from a reset failure e.g. during driver unbind. > Yes, but it should be safe with the protection of software IOTLB even if the reset() fails during driver unbind. Thanks, Yongji