RE: [PATCH 2/2] iommu/ipmmu-vmsa: Add support for r8a779a0

2021-09-06 Thread Yoshihiro Shimoda
Hi Geert-san,

> From: Geert Uytterhoeven, Sent: Tuesday, September 7, 2021 12:34 AM
> 
> Hi Shimoda-san,
> 
> On Wed, Sep 1, 2021 at 12:27 PM Yoshihiro Shimoda
>  wrote:
> > Add support for r8a779a0 (R-Car V3U). The IPMMU hardware design
> > of this SoC differs from the others, so add a new ipmmu_features for it.
> >
> > Signed-off-by: Yoshihiro Shimoda 
> 
> > --- a/drivers/iommu/ipmmu-vmsa.c
> > +++ b/drivers/iommu/ipmmu-vmsa.c
> 
> > @@ -922,6 +922,20 @@ static const struct ipmmu_features 
> > ipmmu_features_rcar_gen3 = {
> > .utlb_offset_base = 0,
> >  };
> >
> > +static const struct ipmmu_features ipmmu_features_r8a779a0 = {
> > +   .use_ns_alias_offset = false,
> > +   .has_cache_leaf_nodes = true,
> > +   .number_of_contexts = 8,
> 
> Shouldn't this be 16?
> Or do you plan to add support for more than 8 contexts later, as that
> would require increasing IPMMU_CTX_MAX, and updating ipmmu_ctx_reg()
> to handle the second bank of 8 contexts?

I would like to add support for more than 8 contexts later because
I realized that ctx_offset_{base,stride} are not suitable for the second bank
of 8 contexts...
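
To make the bank-split idea concrete, here is a minimal standalone sketch of
how a context-register lookup could handle a second bank of 8 contexts. The
"2"-suffixed field name and the bank size of 8 are illustrative assumptions,
not the actual driver change:

/*
 * Hypothetical sketch only (not the real ipmmu-vmsa change): contexts 0-7
 * use the existing base/stride layout, while contexts 8-15 start at a
 * separate, assumed second base offset.
 */
struct ipmmu_features_sketch {
	unsigned int number_of_contexts;
	unsigned int ctx_offset_base;	/* first bank of 8 contexts */
	unsigned int ctx_offset_stride;
	unsigned int ctx_offset_base2;	/* hypothetical second bank */
};

static unsigned int ctx_reg_sketch(const struct ipmmu_features_sketch *f,
				   unsigned int context_id, unsigned int reg)
{
	unsigned int base = f->ctx_offset_base;

	if (context_id >= 8) {		/* contexts 8..15 */
		base = f->ctx_offset_base2;
		context_id -= 8;
	}
	return base + context_id * f->ctx_offset_stride + reg;
}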

> Regardless, I assume this will still work when limiting to 8
> contexts, so
> Reviewed-by: Geert Uytterhoeven 

Thank you for your review!

Best regards,
Yoshihiro Shimoda



Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing

2021-09-06 Thread Jon Nettleton
On Mon, Sep 6, 2021 at 7:44 PM Robin Murphy  wrote:
>
> On 2021-08-05 17:03, Lorenzo Pieralisi wrote:
> > On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote:
> >
> > [...]
> >
> >> +static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
> >> +{
> >> +struct acpi_iort_node *smmu;
> >> +struct acpi_iort_rmr *rmr;
> >> +struct acpi_iort_rmr_desc *rmr_desc;
> >> +u32 map_count = iort_node->mapping_count;
> >> +u32 sid;
> >> +int i;
> >> +
> >> +if (!iort_node->mapping_offset || map_count != 1) {
> >> +pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
> >> +   iort_node);
> >> +return;
> >> +}
> >> +
> >> +/* Retrieve associated smmu and stream id */
> >> +smmu = iort_node_get_id(iort_node, &sid, 0);
> >> +if (!smmu) {
> >> +pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node 
> >> %p\n",
> >> +   iort_node);
> >> +return;
> >> +}
> >> +
> >> +/* Retrieve RMR data */
> >> +rmr = (struct acpi_iort_rmr *)iort_node->node_data;
> >> +if (!rmr->rmr_offset || !rmr->rmr_count) {
> >> +pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR 
> >> node %p\n",
> >> +   iort_node);
> >> +return;
> >> +}
> >> +
> >> +rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
> >> +rmr->rmr_offset);
> >> +
> >> +iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
> >> +
> >> +for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
> >> +struct iommu_resv_region *region;
> >> +enum iommu_resv_type type;
> >> +int prot = IOMMU_READ | IOMMU_WRITE;
> >> +u64 addr = rmr_desc->base_address, size = rmr_desc->length;
> >> +
> >> +if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
> >> +/* PAGE align base addr and size */
> >> +addr &= PAGE_MASK;
> >> +size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
> >> +
> >> +pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not 
> >> aligned to 64K, continue with [0x%llx - 0x%llx]\n",
> >> +   rmr_desc->base_address,
> >> +   rmr_desc->base_address + rmr_desc->length - 1,
> >> +   addr, addr + size - 1);
> >> +}
> >> +if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
> >> +type = IOMMU_RESV_DIRECT_RELAXABLE;
> >> +/*
> >> + * Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
> >> + * normally used for allocated system memory that is
> >> + * then used for device specific reserved regions.
> >> + */
> >> +prot |= IOMMU_CACHE;
> >> +} else {
> >> +type = IOMMU_RESV_DIRECT;
> >> +/*
> >> + * Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally used
> >> + * for device memory like MSI doorbell.
> >> + */
> >> +prot |= IOMMU_MMIO;
> >> +}
> >
> > On the prot value assignment based on the remapping flag, I'd like to
> > hear Robin/Joerg's opinion, I'd avoid being in a situation where
> > "normally" this would work but then we have to quirk it.
> >
> > Is this a valid assumption _always_ ?
>
> No. Certainly applying IOMMU_CACHE without reference to the device's
> _CCA attribute or how CPUs may be accessing a shared buffer could lead
> to a loss of coherency. At worst, applying IOMMU_MMIO to a
> device-private buffer *could* cause the device to lose coherency with
> itself if the memory underlying the RMR may have allocated into system
> caches. Note that the expected use for non-remappable RMRs is the device
> holding some sort of long-lived private data in system RAM - the MSI
> doorbell trick is far more of a niche hack really.
>
> At the very least I think we need to refer to the device's memory access
> properties here.
>
> Jon, Laurentiu - how do RMRs correspond to the EFI memory map on your
> firmware? I'm starting to think that as long as the underlying memory is
> described appropriately there then we should be able to infer correct
> attributes from the EFI memory type and flags.

The devices are all cache coherent and marked with _CCA = 1.  The memory
regions are in the virt table as ARM_MEMORY_REGION_ATTRIBUTE_DEVICE.

The current chicken-and-egg problem we have is that during fsl-mc-bus
initialization we call

error = acpi_dma_configure_id(&pdev->dev, DEV_DMA_COHERENT,
                              &mc_stream_id);

which gets deferred because the SMMU has not been initialized yet. Then we
initialize the RMR tables but there is no device reference there to be able to

Re: [PATCH v7 2/9] ACPI/IORT: Add support for RMR node parsing

2021-09-06 Thread Robin Murphy

On 2021-08-05 17:03, Lorenzo Pieralisi wrote:

On Thu, Aug 05, 2021 at 09:07:17AM +0100, Shameer Kolothum wrote:

[...]


+static void __init iort_node_get_rmr_info(struct acpi_iort_node *iort_node)
+{
+   struct acpi_iort_node *smmu;
+   struct acpi_iort_rmr *rmr;
+   struct acpi_iort_rmr_desc *rmr_desc;
+   u32 map_count = iort_node->mapping_count;
+   u32 sid;
+   int i;
+
+   if (!iort_node->mapping_offset || map_count != 1) {
+   pr_err(FW_BUG "Invalid ID mapping, skipping RMR node %p\n",
+  iort_node);
+   return;
+   }
+
+   /* Retrieve associated smmu and stream id */
+   smmu = iort_node_get_id(iort_node, &sid, 0);
+   if (!smmu) {
+   pr_err(FW_BUG "Invalid SMMU reference, skipping RMR node %p\n",
+  iort_node);
+   return;
+   }
+
+   /* Retrieve RMR data */
+   rmr = (struct acpi_iort_rmr *)iort_node->node_data;
+   if (!rmr->rmr_offset || !rmr->rmr_count) {
+   pr_err(FW_BUG "Invalid RMR descriptor array, skipping RMR node 
%p\n",
+  iort_node);
+   return;
+   }
+
+   rmr_desc = ACPI_ADD_PTR(struct acpi_iort_rmr_desc, iort_node,
+   rmr->rmr_offset);
+
+   iort_rmr_desc_check_overlap(rmr_desc, rmr->rmr_count);
+
+   for (i = 0; i < rmr->rmr_count; i++, rmr_desc++) {
+   struct iommu_resv_region *region;
+   enum iommu_resv_type type;
+   int prot = IOMMU_READ | IOMMU_WRITE;
+   u64 addr = rmr_desc->base_address, size = rmr_desc->length;
+
+   if (!IS_ALIGNED(addr, SZ_64K) || !IS_ALIGNED(size, SZ_64K)) {
+   /* PAGE align base addr and size */
+   addr &= PAGE_MASK;
+   size = PAGE_ALIGN(size + offset_in_page(rmr_desc->base_address));
+
+   pr_err(FW_BUG "RMR descriptor[0x%llx - 0x%llx] not aligned 
to 64K, continue with [0x%llx - 0x%llx]\n",
+  rmr_desc->base_address,
+  rmr_desc->base_address + rmr_desc->length - 1,
+  addr, addr + size - 1);
+   }
+   if (rmr->flags & IOMMU_RMR_REMAP_PERMITTED) {
+   type = IOMMU_RESV_DIRECT_RELAXABLE;
+   /*
+* Set IOMMU_CACHE as IOMMU_RESV_DIRECT_RELAXABLE is
+* normally used for allocated system memory that is
+* then used for device specific reserved regions.
+*/
+   prot |= IOMMU_CACHE;
+   } else {
+   type = IOMMU_RESV_DIRECT;
+   /*
+* Set IOMMU_MMIO as IOMMU_RESV_DIRECT is normally used
+* for device memory like MSI doorbell.
+*/
+   prot |= IOMMU_MMIO;
+   }


On the prot value assignment based on the remapping flag, I'd like to
hear Robin/Joerg's opinion, I'd avoid being in a situation where
"normally" this would work but then we have to quirk it.

Is this a valid assumption _always_ ?


No. Certainly applying IOMMU_CACHE without reference to the device's 
_CCA attribute or how CPUs may be accessing a shared buffer could lead 
to a loss of coherency. At worst, applying IOMMU_MMIO to a 
device-private buffer *could* cause the device to lose coherency with 
itself if the memory underlying the RMR may have allocated into system 
caches. Note that the expected use for non-remappable RMRs is the device 
holding some sort of long-lived private data in system RAM - the MSI 
doorbell trick is far more of a niche hack really.


At the very least I think we need to refer to the device's memory access 
properties here.


Jon, Laurentiu - how do RMRs correspond to the EFI memory map on your 
firmware? I'm starting to think that as long as the underlying memory is 
described appropriately there then we should be able to infer correct 
attributes from the EFI memory type and flags.
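
For illustration, a rough sketch of what inferring the prot flags from the
EFI memory map could look like - purely hypothetical, and the mapping of
EFI attributes to IOMMU prot bits below is an assumption rather than a
proposal of the final code:

/*
 * Hypothetical sketch (not proposed code): derive RMR prot flags from the
 * EFI memory map attributes of the region instead of from the remap flag.
 * Assumes the usual kernel headers (<linux/efi.h>, <linux/iommu.h>).
 */
static int rmr_prot_from_efi_sketch(u64 base_address)
{
	u64 attr = efi_mem_attributes(base_address);
	int prot = IOMMU_READ | IOMMU_WRITE;

	if (attr & EFI_MEMORY_WB)	/* normal, cacheable system RAM */
		prot |= IOMMU_CACHE;
	else				/* device or uncached memory */
		prot |= IOMMU_MMIO;

	return prot;
}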


Robin.


Re: [PATCH 2/2] iommu/ipmmu-vmsa: Add support for r8a779a0

2021-09-06 Thread Geert Uytterhoeven
Hi Shimoda-san,

On Wed, Sep 1, 2021 at 12:27 PM Yoshihiro Shimoda
 wrote:
> Add support for r8a779a0 (R-Car V3U). The IPMMU hardware design
> of this SoC differs from the others, so add a new ipmmu_features for it.
>
> Signed-off-by: Yoshihiro Shimoda 

> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c

> @@ -922,6 +922,20 @@ static const struct ipmmu_features 
> ipmmu_features_rcar_gen3 = {
> .utlb_offset_base = 0,
>  };
>
> +static const struct ipmmu_features ipmmu_features_r8a779a0 = {
> +   .use_ns_alias_offset = false,
> +   .has_cache_leaf_nodes = true,
> +   .number_of_contexts = 8,

Shouldn't this be 16?
Or do you plan to add support for more than 8 contexts later, as that
would require increasing IPMMU_CTX_MAX, and updating ipmmu_ctx_reg()
to handle the second bank of 8 contexts?

Regardless, I assume this will still work when limiting to 8
contexts, so
Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 1/2] dt-bindings: iommu: renesas, ipmmu-vmsa: add r8a779a0 support

2021-09-06 Thread Geert Uytterhoeven
On Wed, Sep 1, 2021 at 12:27 PM Yoshihiro Shimoda
 wrote:
> Add support for r8a779a0 (R-Car V3U).
>
> Signed-off-by: Yoshihiro Shimoda 

Reviewed-by: Geert Uytterhoeven 

Gr{oetje,eeting}s,

Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops

2021-09-06 Thread Yongji Xie
On Mon, Sep 6, 2021 at 6:43 PM Michael S. Tsirkin  wrote:
>
> On Mon, Sep 06, 2021 at 04:45:55PM +0800, Yongji Xie wrote:
> > On Mon, Sep 6, 2021 at 4:01 PM Michael S. Tsirkin  wrote:
> > >
> > > On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote:
> > > > On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin  
> > > > wrote:
> > > > >
> > > > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> > > > > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > > > > > > This adds a new callback to support device specific reset
> > > > > > > > behavior. The vdpa bus driver will call the reset function
> > > > > > > > instead of setting status to zero during resetting.
> > > > > > > >
> > > > > > > > Signed-off-by: Xie Yongji 
> > > > > > >
> > > > > > >
> > > > > > > This does gloss over a significant change though:
> > > > > > >
> > > > > > >
> > > > > > > > ---
> > > > > > > > @@ -348,12 +352,12 @@ static inline struct device 
> > > > > > > > *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > > > > > > >   return vdev->dma_dev;
> > > > > > > >  }
> > > > > > > >
> > > > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > > > > > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > > > > > > >  {
> > > > > > > >   const struct vdpa_config_ops *ops = vdev->config;
> > > > > > > >
> > > > > > > >   vdev->features_valid = false;
> > > > > > > > - ops->set_status(vdev, 0);
> > > > > > > > + return ops->reset(vdev);
> > > > > > > >  }
> > > > > > > >
> > > > > > > >  static inline int vdpa_set_features(struct vdpa_device *vdev, 
> > > > > > > > u64 features)
> > > > > > >
> > > > > > >
> > > > > > > Unfortunately this breaks virtio_vdpa:
> > > > > > >
> > > > > > >
> > > > > > > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > > > > > > {
> > > > > > > struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> > > > > > >
> > > > > > > vdpa_reset(vdpa);
> > > > > > > }
> > > > > > >
> > > > > > >
> > > > > > > and there's no easy way to fix this, kernel can't recover
> > > > > > > from a reset failure e.g. during driver unbind.
> > > > > > >
> > > > > >
> > > > > > Yes, but it should be safe with the protection of software IOTLB 
> > > > > > even
> > > > > > if the reset() fails during driver unbind.
> > > > > >
> > > > > > Thanks,
> > > > > > Yongji
> > > > >
> > > > > Hmm. I don't see it.
> > > > > What exactly will happen? What prevents device from poking at
> > > > > memory after reset? Note that dma unmap in e.g. del_vqs happens
> > > > > too late.
> > > >
> > > > But I didn't see any problems with touching the memory for virtqueues.
> > >
> > > Drivers make the assumption that after reset returns no new
> > > buffers will be consumed. For example a bunch of drivers
> > > call virtqueue_detach_unused_buf.
> >
> > I'm not sure if I get your point. But it looks like
> > virtqueue_detach_unused_buf() will check the driver's metadata first
> > rather than read the memory from virtqueue.
> >
> > > I can't say whether block makes this assumption anywhere.
> > > Needs careful auditing.
> > >
> > > > The memory should not be freed after dma unmap?
> > >
> > > But unmap does not happen until after the reset.
> > >
> >
> > I mean the memory is totally allocated and controlled by the VDUSE
> > driver. The VDUSE driver will not return them to the buddy system
> > unless userspace unmap it.
>
> Right. But what stops VDUSE from poking at memory after
> reset failed?
>

Only itself. But in normal cases, userspace would not poke at the
memory since there is no available data after reset.

And it makes me wonder whether it's better to disallow userspace from
returning errors for the reset message. Then the only case that leads to
a reset failure would be a request timeout, which already marks the
device as broken.

>
> > >
> > > > And the memory for the bounce buffer should also be safe to be
> > > > accessed by userspace in this case.
> > > >
> > > > > And what about e.g. interrupts?
> > > > > E.g. we have this:
> > > > >
> > > > > /* Virtqueues are stopped, nothing can use vblk->vdev 
> > > > > anymore. */
> > > > > vblk->vdev = NULL;
> > > > >
> > > > > and this is no longer true at this point.
> > > > >
> > > >
> > > > You're right. But I didn't see where the interrupt handler will use
> > > > the vblk->vdev.
> > >
> > > static void virtblk_done(struct virtqueue *vq)
> > > {
> > > struct virtio_blk *vblk = vq->vdev->priv;
> > >
> > > vq->vdev is the same as vblk->vdev.
> > >
> >
> > We will test the vq->ready (will be set to false in del_vqs()) before
> > injecting an interrupt in the VDUSE driver. So it should be OK?
>
> Maybe not ...  It's not designed for such asynchronous access, so e.g.
> there's no locking or memory ordering around accesses.
>

Yes, so we still need the below fix.

>
> > >
> > > > So it seems to be 

Re: [git pull] IOMMU Updates for Linux v5.15

2021-09-06 Thread Robin Murphy

On 2021-09-03 22:44, Joerg Roedel wrote:

On Fri, Sep 03, 2021 at 11:43:31AM -0700, Linus Torvalds wrote:

   choice
	prompt "IOMMU default domain type"
	depends on IOMMU_API
	default IOMMU_DEFAULT_DMA_LAZY if AMD_IOMMU || INTEL_IOMMU
	default IOMMU_DEFAULT_DMA_STRICT


Huh, yeah, that is bogus. Seems like I overlooked that part of the
patch-set because I was so amazed by the simplifications and cleanups in
the rest of it.


Mad as it looks, this does in fact capture the existing behaviour. What 
we've consolidated here was previously a weird mix of driver- and 
subsystem-level controls, and it is specifically those two drivers which 
have a long-standing expectation of using lazy behaviour by default.



See what I'm saying? Making the default be based on some random "this
driver is enabled" when it can then affect *other* drivers that are
also enabled and not part of the decision seems to be a fundamental
confusion about what is going on, when it's not at all clear which
driver will actually be IN USE.


The Kconfig option itself was actually my suggestion, but how the
default value is chosen certainly needs improvement. I will sort this
out with the people involved.


IOW, the fix might be to just say "the default is always lazy".

Or the fix might be something that is global to a configuration and
doesn't rely on which iommu is in use (eg "on x86, the default is
always LAZY")


We could certainly express it as "default IOMMU_DEFAULT_DMA_LAZY if X86" 
if people would prefer - virtio-iommu doesn't support lazy mode either 
way, so the end result will still be equivalent.


Robin.


Or the fix is to make that 'iommu_dma_strict' variable - and the
default value for it - be a per-IOMMU thing rather than be a global.


My preference would be to make 'lazy' or 'strict' the default for all,
but the ARM folks might disagree. On the other hand, it also doesn't make
sense to let IOMMU drivers override the user's Kconfig choice at runtime.
We will discuss this and come up with something better.
Thanks,

Joerg


Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops

2021-09-06 Thread Michael S. Tsirkin
On Mon, Sep 06, 2021 at 04:45:55PM +0800, Yongji Xie wrote:
> On Mon, Sep 6, 2021 at 4:01 PM Michael S. Tsirkin  wrote:
> >
> > On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote:
> > > On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin  wrote:
> > > >
> > > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> > > > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin  
> > > > > wrote:
> > > > > >
> > > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > > > > > This adds a new callback to support device specific reset
> > > > > > > behavior. The vdpa bus driver will call the reset function
> > > > > > > instead of setting status to zero during resetting.
> > > > > > >
> > > > > > > Signed-off-by: Xie Yongji 
> > > > > >
> > > > > >
> > > > > > This does gloss over a significant change though:
> > > > > >
> > > > > >
> > > > > > > ---
> > > > > > > @@ -348,12 +352,12 @@ static inline struct device 
> > > > > > > *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > > > > > >   return vdev->dma_dev;
> > > > > > >  }
> > > > > > >
> > > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > > > > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > > > > > >  {
> > > > > > >   const struct vdpa_config_ops *ops = vdev->config;
> > > > > > >
> > > > > > >   vdev->features_valid = false;
> > > > > > > - ops->set_status(vdev, 0);
> > > > > > > + return ops->reset(vdev);
> > > > > > >  }
> > > > > > >
> > > > > > >  static inline int vdpa_set_features(struct vdpa_device *vdev, 
> > > > > > > u64 features)
> > > > > >
> > > > > >
> > > > > > Unfortunately this breaks virtio_vdpa:
> > > > > >
> > > > > >
> > > > > > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > > > > > {
> > > > > > struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> > > > > >
> > > > > > vdpa_reset(vdpa);
> > > > > > }
> > > > > >
> > > > > >
> > > > > > and there's no easy way to fix this, kernel can't recover
> > > > > > from a reset failure e.g. during driver unbind.
> > > > > >
> > > > >
> > > > > Yes, but it should be safe with the protection of software IOTLB even
> > > > > if the reset() fails during driver unbind.
> > > > >
> > > > > Thanks,
> > > > > Yongji
> > > >
> > > > Hmm. I don't see it.
> > > > What exactly will happen? What prevents device from poking at
> > > > memory after reset? Note that dma unmap in e.g. del_vqs happens
> > > > too late.
> > >
> > > But I didn't see any problems with touching the memory for virtqueues.
> >
> > Drivers make the assumption that after reset returns no new
> > buffers will be consumed. For example a bunch of drivers
> > call virtqueue_detach_unused_buf.
> 
> I'm not sure if I get your point. But it looks like
> virtqueue_detach_unused_buf() will check the driver's metadata first
> rather than read the memory from virtqueue.
> 
> > I can't say whether block makes this assumption anywhere.
> > Needs careful auditing.
> >
> > > The memory should not be freed after dma unmap?
> >
> > But unmap does not happen until after the reset.
> >
> 
> I mean the memory is totally allocated and controlled by the VDUSE
> driver. The VDUSE driver will not return them to the buddy system
> unless userspace unmap it.

Right. But what stops VDUSE from poking at memory after
reset failed?



> >
> > > And the memory for the bounce buffer should also be safe to be
> > > accessed by userspace in this case.
> > >
> > > > And what about e.g. interrupts?
> > > > E.g. we have this:
> > > >
> > > > /* Virtqueues are stopped, nothing can use vblk->vdev anymore. 
> > > > */
> > > > vblk->vdev = NULL;
> > > >
> > > > and this is no longer true at this point.
> > > >
> > >
> > > You're right. But I didn't see where the interrupt handler will use
> > > the vblk->vdev.
> >
> > static void virtblk_done(struct virtqueue *vq)
> > {
> > struct virtio_blk *vblk = vq->vdev->priv;
> >
> > vq->vdev is the same as vblk->vdev.
> >
> 
> We will test the vq->ready (will be set to false in del_vqs()) before
> injecting an interrupt in the VDUSE driver. So it should be OK?

Maybe not ...  It's not designed for such asynchronous access, so e.g.
there's no locking or memory ordering around accesses.
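
To illustrate the concern with a minimal standalone sketch (this is not the
VDUSE code; the flag and callback names are made up):

#include <stdbool.h>

/*
 * Hypothetical sketch: an unsynchronized "ready" check before injecting an
 * interrupt.  Without locking or memory ordering, the injection path can
 * observe a stale value and invoke the callback after teardown has begun.
 */
static bool vq_ready = true;		/* cleared during del_vqs()  */
static void (*vq_callback)(void);	/* cleared during teardown   */

static void inject_interrupt(void)
{
	if (!vq_ready)			/* plain load, may race      */
		return;
	/*
	 * Window: teardown can clear vq_ready and vq_callback here, after
	 * the check but before the call below runs.
	 */
	if (vq_callback)
		vq_callback();
}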


> >
> > > So it seems to be not too late to fix it:
> > >
> > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > index 5c25ff6483ad..ea41a7389a26 100644
> > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct
> > > vdpa_device *vdpa, unsigned int offset,
> > >  static int vduse_vdpa_reset(struct vdpa_device *vdpa)
> > >  {
> > > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > > +   int ret;
> > >
> > > -   if (vduse_dev_set_status(dev, 0))
> > > -   return -EIO;
> > > +   ret = vduse_dev_set_status(dev, 0);
> > >
> > > 

Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops

2021-09-06 Thread Yongji Xie
On Mon, Sep 6, 2021 at 4:01 PM Michael S. Tsirkin  wrote:
>
> On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote:
> > On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin  wrote:
> > >
> > > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> > > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin  
> > > > wrote:
> > > > >
> > > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > > > > This adds a new callback to support device specific reset
> > > > > > behavior. The vdpa bus driver will call the reset function
> > > > > > instead of setting status to zero during resetting.
> > > > > >
> > > > > > Signed-off-by: Xie Yongji 
> > > > >
> > > > >
> > > > > This does gloss over a significant change though:
> > > > >
> > > > >
> > > > > > ---
> > > > > > @@ -348,12 +352,12 @@ static inline struct device 
> > > > > > *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > > > > >   return vdev->dma_dev;
> > > > > >  }
> > > > > >
> > > > > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > > > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > > > > >  {
> > > > > >   const struct vdpa_config_ops *ops = vdev->config;
> > > > > >
> > > > > >   vdev->features_valid = false;
> > > > > > - ops->set_status(vdev, 0);
> > > > > > + return ops->reset(vdev);
> > > > > >  }
> > > > > >
> > > > > >  static inline int vdpa_set_features(struct vdpa_device *vdev, u64 
> > > > > > features)
> > > > >
> > > > >
> > > > > Unfortunately this breaks virtio_vdpa:
> > > > >
> > > > >
> > > > > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > > > > {
> > > > > struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> > > > >
> > > > > vdpa_reset(vdpa);
> > > > > }
> > > > >
> > > > >
> > > > > and there's no easy way to fix this, kernel can't recover
> > > > > from a reset failure e.g. during driver unbind.
> > > > >
> > > >
> > > > Yes, but it should be safe with the protection of software IOTLB even
> > > > if the reset() fails during driver unbind.
> > > >
> > > > Thanks,
> > > > Yongji
> > >
> > > Hmm. I don't see it.
> > > What exactly will happen? What prevents device from poking at
> > > memory after reset? Note that dma unmap in e.g. del_vqs happens
> > > too late.
> >
> > But I didn't see any problems with touching the memory for virtqueues.
>
> Drivers make the assumption that after reset returns no new
> buffers will be consumed. For example a bunch of drivers
> call virtqueue_detach_unused_buf.

I'm not sure if I get your point. But it looks like
virtqueue_detach_unused_buf() will check the driver's metadata first
rather than read the memory from the virtqueue.

> I can't say whether block makes this assumption anywhere.
> Needs careful auditing.
>
> > The memory should not be freed after dma unmap?
>
> But unmap does not happen until after the reset.
>

I mean the memory is totally allocated and controlled by the VDUSE
driver. The VDUSE driver will not return it to the buddy system
unless userspace unmaps it.

>
> > And the memory for the bounce buffer should also be safe to be
> > accessed by userspace in this case.
> >
> > > And what about e.g. interrupts?
> > > E.g. we have this:
> > >
> > > /* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
> > > vblk->vdev = NULL;
> > >
> > > and this is no longer true at this point.
> > >
> >
> > You're right. But I didn't see where the interrupt handler will use
> > the vblk->vdev.
>
> static void virtblk_done(struct virtqueue *vq)
> {
> struct virtio_blk *vblk = vq->vdev->priv;
>
> vq->vdev is the same as vblk->vdev.
>

We will test vq->ready (which will be set to false in del_vqs()) before
injecting an interrupt in the VDUSE driver. So it should be OK?

>
> > So it seems to be not too late to fix it:
> >
> > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c
> > b/drivers/vdpa/vdpa_user/vduse_dev.c
> > index 5c25ff6483ad..ea41a7389a26 100644
> > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct
> > vdpa_device *vdpa, unsigned int offset,
> >  static int vduse_vdpa_reset(struct vdpa_device *vdpa)
> >  {
> > struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> > +   int ret;
> >
> > -   if (vduse_dev_set_status(dev, 0))
> > -   return -EIO;
> > +   ret = vduse_dev_set_status(dev, 0);
> >
> > vduse_dev_reset(dev);
> >
> > -   return 0;
> > +   return ret;
> >  }
> >
> >  static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)
> >
> > Thanks,
> > Yongji
>
> Needs some comments to explain why it's done like this.
>

This is used to make sure that userspace can't inject interrupts any
more after reset. vduse_dev_reset() will clear the interrupt callback
and flush the irq kworker.
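
For reference, a hedged sketch of the hunk from earlier in the thread with
such a comment added (not necessarily the final upstream wording):

static int vduse_vdpa_reset(struct vdpa_device *vdpa)
{
	struct vduse_dev *dev = vdpa_to_vduse(vdpa);
	int ret;

	ret = vduse_dev_set_status(dev, 0);

	/*
	 * Clear the interrupt callback and flush the irq kworker even if
	 * userspace rejected the reset, so that no further interrupts can
	 * be injected after this point.
	 */
	vduse_dev_reset(dev);

	return ret;
}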

> BTW device is generally wedged at this point right?
> E.g. if reset during initialization fails, userspace
> will still get 

Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops

2021-09-06 Thread Michael S. Tsirkin
On Mon, Sep 06, 2021 at 03:06:44PM +0800, Yongji Xie wrote:
> On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin  wrote:
> >
> > On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> > > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin  wrote:
> > > >
> > > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > > > This adds a new callback to support device specific reset
> > > > > behavior. The vdpa bus driver will call the reset function
> > > > > instead of setting status to zero during resetting.
> > > > >
> > > > > Signed-off-by: Xie Yongji 
> > > >
> > > >
> > > > This does gloss over a significant change though:
> > > >
> > > >
> > > > > ---
> > > > > @@ -348,12 +352,12 @@ static inline struct device 
> > > > > *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > > > >   return vdev->dma_dev;
> > > > >  }
> > > > >
> > > > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > > > >  {
> > > > >   const struct vdpa_config_ops *ops = vdev->config;
> > > > >
> > > > >   vdev->features_valid = false;
> > > > > - ops->set_status(vdev, 0);
> > > > > + return ops->reset(vdev);
> > > > >  }
> > > > >
> > > > >  static inline int vdpa_set_features(struct vdpa_device *vdev, u64 
> > > > > features)
> > > >
> > > >
> > > > Unfortunately this breaks virtio_vdpa:
> > > >
> > > >
> > > > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > > > {
> > > > struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> > > >
> > > > vdpa_reset(vdpa);
> > > > }
> > > >
> > > >
> > > > and there's no easy way to fix this, kernel can't recover
> > > > from a reset failure e.g. during driver unbind.
> > > >
> > >
> > > Yes, but it should be safe with the protection of software IOTLB even
> > > if the reset() fails during driver unbind.
> > >
> > > Thanks,
> > > Yongji
> >
> > Hmm. I don't see it.
> > What exactly will happen? What prevents device from poking at
> > memory after reset? Note that dma unmap in e.g. del_vqs happens
> > too late.
> 
> But I didn't see any problems with touching the memory for virtqueues.

Drivers make the assumption that after reset returns no new
buffers will be consumed. For example a bunch of drivers
call virtqueue_detach_unused_buf.
I can't say whether block makes this assumption anywhere.
Needs careful auditing.

> The memory should not be freed after dma unmap?

But unmap does not happen until after the reset.


> And the memory for the bounce buffer should also be safe to be
> accessed by userspace in this case.
> 
> > And what about e.g. interrupts?
> > E.g. we have this:
> >
> > /* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
> > vblk->vdev = NULL;
> >
> > and this is no longer true at this point.
> >
> 
> You're right. But I didn't see where the interrupt handler will use
> the vblk->vdev.

static void virtblk_done(struct virtqueue *vq)
{
struct virtio_blk *vblk = vq->vdev->priv;

vq->vdev is the same as vblk->vdev.


> So it seems to be not too late to fix it:
> 
> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c
> b/drivers/vdpa/vdpa_user/vduse_dev.c
> index 5c25ff6483ad..ea41a7389a26 100644
> --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> @@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct
> vdpa_device *vdpa, unsigned int offset,
>  static int vduse_vdpa_reset(struct vdpa_device *vdpa)
>  {
> struct vduse_dev *dev = vdpa_to_vduse(vdpa);
> +   int ret;
> 
> -   if (vduse_dev_set_status(dev, 0))
> -   return -EIO;
> +   ret = vduse_dev_set_status(dev, 0);
> 
> vduse_dev_reset(dev);
> 
> -   return 0;
> +   return ret;
>  }
> 
>  static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)
> 
> Thanks,
> Yongji

Needs some comments to explain why it's done like this.

BTW, the device is generally wedged at this point, right?
E.g. if reset during initialization fails, userspace
will still get the reset at some later point and be
confused ...

-- 
MST



Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops

2021-09-06 Thread Yongji Xie
On Mon, Sep 6, 2021 at 2:37 PM Michael S. Tsirkin  wrote:
>
> On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> > On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin  wrote:
> > >
> > > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > > This adds a new callback to support device specific reset
> > > > behavior. The vdpa bus driver will call the reset function
> > > > instead of setting status to zero during resetting.
> > > >
> > > > Signed-off-by: Xie Yongji 
> > >
> > >
> > > This does gloss over a significant change though:
> > >
> > >
> > > > ---
> > > > @@ -348,12 +352,12 @@ static inline struct device 
> > > > *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > > >   return vdev->dma_dev;
> > > >  }
> > > >
> > > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > > >  {
> > > >   const struct vdpa_config_ops *ops = vdev->config;
> > > >
> > > >   vdev->features_valid = false;
> > > > - ops->set_status(vdev, 0);
> > > > + return ops->reset(vdev);
> > > >  }
> > > >
> > > >  static inline int vdpa_set_features(struct vdpa_device *vdev, u64 
> > > > features)
> > >
> > >
> > > Unfortunately this breaks virtio_vdpa:
> > >
> > >
> > > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > > {
> > > struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> > >
> > > vdpa_reset(vdpa);
> > > }
> > >
> > >
> > > and there's no easy way to fix this, kernel can't recover
> > > from a reset failure e.g. during driver unbind.
> > >
> >
> > Yes, but it should be safe with the protection of software IOTLB even
> > if the reset() fails during driver unbind.
> >
> > Thanks,
> > Yongji
>
> Hmm. I don't see it.
> What exactly will happen? What prevents device from poking at
> memory after reset? Note that dma unmap in e.g. del_vqs happens
> too late.

But I didn't see any problems with touching the memory for virtqueues.
The memory should not be freed after dma unmap?

And the memory for the bounce buffer should also be safe to be
accessed by userspace in this case.

> And what about e.g. interrupts?
> E.g. we have this:
>
> /* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
> vblk->vdev = NULL;
>
> and this is no longer true at this point.
>

You're right. But I didn't see where the interrupt handler will use
the vblk->vdev.

So it seems to be not too late to fix it:

diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c
b/drivers/vdpa/vdpa_user/vduse_dev.c
index 5c25ff6483ad..ea41a7389a26 100644
--- a/drivers/vdpa/vdpa_user/vduse_dev.c
+++ b/drivers/vdpa/vdpa_user/vduse_dev.c
@@ -665,13 +665,13 @@ static void vduse_vdpa_set_config(struct
vdpa_device *vdpa, unsigned int offset,
 static int vduse_vdpa_reset(struct vdpa_device *vdpa)
 {
struct vduse_dev *dev = vdpa_to_vduse(vdpa);
+   int ret;

-   if (vduse_dev_set_status(dev, 0))
-   return -EIO;
+   ret = vduse_dev_set_status(dev, 0);

vduse_dev_reset(dev);

-   return 0;
+   return ret;
 }

 static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa)

Thanks,
Yongji


Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops

2021-09-06 Thread Michael S. Tsirkin
On Mon, Sep 06, 2021 at 02:09:25PM +0800, Yongji Xie wrote:
> On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > > This adds a new callback to support device specific reset
> > > behavior. The vdpa bus driver will call the reset function
> > > instead of setting status to zero during resetting.
> > >
> > > Signed-off-by: Xie Yongji 
> >
> >
> > This does gloss over a significant change though:
> >
> >
> > > ---
> > > @@ -348,12 +352,12 @@ static inline struct device 
> > > *vdpa_get_dma_dev(struct vdpa_device *vdev)
> > >   return vdev->dma_dev;
> > >  }
> > >
> > > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > > +static inline int vdpa_reset(struct vdpa_device *vdev)
> > >  {
> > >   const struct vdpa_config_ops *ops = vdev->config;
> > >
> > >   vdev->features_valid = false;
> > > - ops->set_status(vdev, 0);
> > > + return ops->reset(vdev);
> > >  }
> > >
> > >  static inline int vdpa_set_features(struct vdpa_device *vdev, u64 
> > > features)
> >
> >
> > Unfortunately this breaks virtio_vdpa:
> >
> >
> > static void virtio_vdpa_reset(struct virtio_device *vdev)
> > {
> > struct vdpa_device *vdpa = vd_get_vdpa(vdev);
> >
> > vdpa_reset(vdpa);
> > }
> >
> >
> > and there's no easy way to fix this, kernel can't recover
> > from a reset failure e.g. during driver unbind.
> >
> 
> Yes, but it should be safe with the protection of software IOTLB even
> if the reset() fails during driver unbind.
> 
> Thanks,
> Yongji

Hmm. I don't see it.
What exactly will happen? What prevents device from poking at
memory after reset? Note that dma unmap in e.g. del_vqs happens
too late.  And what about e.g. interrupts?
E.g. we have this:

/* Virtqueues are stopped, nothing can use vblk->vdev anymore. */
vblk->vdev = NULL;

and this is no longer true at this point.


-- 
MST



Re: [PATCH v13 05/13] vdpa: Add reset callback in vdpa_config_ops

2021-09-06 Thread Yongji Xie
On Mon, Sep 6, 2021 at 1:56 PM Michael S. Tsirkin  wrote:
>
> On Tue, Aug 31, 2021 at 06:36:26PM +0800, Xie Yongji wrote:
> > This adds a new callback to support device specific reset
> > behavior. The vdpa bus driver will call the reset function
> > instead of setting status to zero during resetting.
> >
> > Signed-off-by: Xie Yongji 
>
>
> This does gloss over a significant change though:
>
>
> > ---
> > @@ -348,12 +352,12 @@ static inline struct device *vdpa_get_dma_dev(struct 
> > vdpa_device *vdev)
> >   return vdev->dma_dev;
> >  }
> >
> > -static inline void vdpa_reset(struct vdpa_device *vdev)
> > +static inline int vdpa_reset(struct vdpa_device *vdev)
> >  {
> >   const struct vdpa_config_ops *ops = vdev->config;
> >
> >   vdev->features_valid = false;
> > - ops->set_status(vdev, 0);
> > + return ops->reset(vdev);
> >  }
> >
> >  static inline int vdpa_set_features(struct vdpa_device *vdev, u64 features)
>
>
> Unfortunately this breaks virtio_vdpa:
>
>
> static void virtio_vdpa_reset(struct virtio_device *vdev)
> {
> struct vdpa_device *vdpa = vd_get_vdpa(vdev);
>
> vdpa_reset(vdpa);
> }
>
>
> and there's no easy way to fix this, kernel can't recover
> from a reset failure e.g. during driver unbind.
>

Yes, but it should be safe with the protection of software IOTLB even
if the reset() fails during driver unbind.

Thanks,
Yongji