Hi Robin, On 04/22/2016 02:36 PM, Robin Murphy wrote: > On 20/04/16 17:14, Eric Auger wrote: >> Hi Robin, >> On 04/20/2016 02:55 PM, Robin Murphy wrote: >>> On 19/04/16 17:56, Eric Auger wrote: >>>> This patch introduces some new fields in the iommu_domain struct, >>>> dedicated to reserved iova management. >>>> >>>> In a similar way as DMA mapping IOVA window, we need to store >>>> information related to a reserved IOVA window. >>>> >>>> The reserved_iova_cookie will store the reserved iova_domain >>>> handle. An RB tree indexed by physical address is introduced to >>>> store the host physical addresses bound to reserved IOVAs. >>>> >>>> Those physical addresses will correspond to MSI frame base >>>> addresses, also referred to as doorbells. Their number should be >>>> quite limited per domain. >>>> >>>> Also a spin_lock is introduced to protect accesses to the iova_domain >>>> and RB tree. The choice of a spin_lock is driven by the fact the RB >>>> tree will need to be accessed in MSI controller code not allowed to >>>> sleep. >>>> >>>> Signed-off-by: Eric Auger <eric.au...@linaro.org> >>>> >>>> --- >>>> v5 -> v6: >>>> - initialize reserved_binding_list >>>> - use a spinlock instead of a mutex >>>> --- >>>> drivers/iommu/iommu.c | 2 ++ >>>> include/linux/iommu.h | 6 ++++++ >>>> 2 files changed, 8 insertions(+) >>>> >>>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c >>>> index b9df141..f70ef3b 100644 >>>> --- a/drivers/iommu/iommu.c >>>> +++ b/drivers/iommu/iommu.c >>>> @@ -1073,6 +1073,8 @@ static struct iommu_domain >>>> *__iommu_domain_alloc(struct bus_type *bus, >>>> >>>> domain->ops = bus->iommu_ops; >>>> domain->type = type; >>>> + spin_lock_init(&domain->reserved_lock); >>>> + domain->reserved_binding_list = RB_ROOT; >>>> >>>> return domain; >>>> } >>>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h >>>> index b3e8c5b..60999db 100644 >>>> --- a/include/linux/iommu.h >>>> +++ b/include/linux/iommu.h >>>> @@ -24,6 +24,7 @@ >>>> #include <linux/of.h> >>>> #include <linux/types.h> >>>> #include <linux/scatterlist.h> >>>> +#include <linux/spinlock.h> >>>> #include <trace/events/iommu.h> >>>> >>>> #define IOMMU_READ (1 << 0) >>>> @@ -83,6 +84,11 @@ struct iommu_domain { >>>> void *handler_token; >>>> struct iommu_domain_geometry geometry; >>>> void *iova_cookie; >>>> + void *reserved_iova_cookie; >>> >>> Why exactly do we need this? From your description, it's for the user of >>> the domain to keep track of IOVA allocations in, but then that's >>> precisely what the iova_cookie exists for. >> >> I was not sure whether both APIs could not be used concurrently, hence a >> separate cookie. If we only consider MSI mapping use case I guess we are >> either with a DMA domain or with a domain for VFIO and I would agree >> with you, ie. we can reuse the same cookie. > > Unless somebody cooks up some paravirtualised monstrosity where the > guest driver somehow uses the host kernel's DMA mapping ops (thankfully, > I'm not sure how that would even be possible), then they should always > be mutually exclusive.
OK thanks > > (That said, I should probably add a sanity check to > iommu_dma_put_cookie() to ensure it only touches the cookies of > IOMMU_DOMAIN_DMA domains...) > >>>> + /* rb tree indexed by PA, for reserved bindings only */ >>>> + struct rb_root reserved_binding_list; >>> >>> Nit: that's more puzzling than helpful - "reserved binding" is >>> particularly vague and nondescript, and makes me think of anything but >>> MSI descriptors. >> my heart is torn between advised genericity and MSI use case. My natural >> short-sighted inclination would head me for an MSI mapping dedicated API >> but I am following advices. As discussed with Alex there are >> implementation details pretty related to MSI problematics I think (the >> fact we store the "bindings" in an rb-tree/list, locking) >> >> If Marc & Alex I can retarget this API to be less generic. >> >> Plus it's called a list but isn't a list (that said, >>> given that we'd typically only expect a handful of entries, and lookups >>> are hardly going to be a performance-critical bottleneck, would a simple >>> list not suffice?) >> I fully agree on that point. An rb-tree is overkill today for MSI use >> case. Again if we were to use this API for anything else, this may >> change the decision. But sure we can refactor afterwards upon needs. TBH >> the rb-tree is inherited from vfio_iommu_type1 dma tree where that code >> was originally located. > > Thinking some more, how feasible would it be to handle the IOVA > management aspect within the existing tree, i.e. extend struct vfio_dma > so an entry can represent different types of thing - DMA pages, MSI > pages, arbitrary reservations - and link to more implementation-specific > data (e.g. a refcounted MSI descriptor stored elsewhere in the domain) > as necessary? it is feasible and was approximately done there at the early ages of the series: https://lkml.org/lkml/2016/1/28/803 & https://lists.cs.columbia.edu/pipermail/kvmarm/2015-September/016607.html Then with the intent of doing something reusable the trend was to put it in the iommu instead of vfio_iommu_typeX. I am currently trying to make the "msi-iommu api" implemented upon the dma-iommu api based on the simplifications that we discussed: - reuse iova_cookie for iova domain - add an opaque msi_cookie that hides the msi doorbell list + its spinlock - simplify locking by making sure the msi domain cannot disappear before the iommu domain destruction If you agree I would suggest to wait and see the outcome of this new design and make a shared decision based on that code? Should be available next week. Best Regards Eric > > Robin. > >>> >>>> + /* protects reserved cookie and rbtree manipulation */ >>>> + spinlock_t reserved_lock; >>> >>> A cookie is an opaque structure, so any locking it needs would normally >>> be hidden within. If on the other hand it's not meant to be opaque at >>> this level, then it should probably be something more specific than a >>> void * (if at all, as above). >> agreed >> >> Thanks >> >> Eric >>> >>> Robin. >>> >>>> }; >>>> >>>> enum iommu_cap { >>>> >>> >> >