On Thu, 8 Nov 2018 11:30:04 +0000
James Sewart <jamessew...@arista.com> wrote:

> Hey,
> 
> > On 8 Nov 2018, at 01:42, Lu Baolu <baolu...@linux.intel.com> wrote:
> > 
> > Hi,
> > 
> > On 11/8/18 1:55 AM, James Sewart wrote:  
> >> Hey,  
> >>> On 7 Nov 2018, at 02:10, Lu Baolu <baolu...@linux.intel.com>
> >>> wrote:
> >>> 
> >>> Hi,
> >>> 
> >>> On 11/6/18 6:40 PM, James Sewart wrote:  
> >>>> Hey Lu,
> >>>> Would you be able to go into more detail about the issues with
> >>>> allowing IOMMU_DOMAIN_DMA to be allocated via domain_alloc?  
> >>> 
> >>> This door is closed because intel iommu driver does everything for
> >>> IOMMU_DOMAIN_DMA: allocating a domain and setup the context
> >>> entries for the domain.  
> >> As far as I can tell, attach_device in the intel driver will handle
> >> cleaning up any old domain context mapping and ensure the new
> >> domain is mapped with calls to dmar_remove_one_dev_info and
> >> domain_add_dev_info.  
> > 
> > That's only for domains of IOMMU_DOMAIN_UNMANAGED, right?  
> 
> attach_device has logic for cleaning up domains that are allocated by
> the intel driver.
> 
>               old_domain = find_domain(dev);
>               if (old_domain) {
>                       rcu_read_lock();
>                       dmar_remove_one_dev_info(old_domain, dev);
>                       rcu_read_unlock();
> 
>                       if (!domain_type_is_vm_or_si(old_domain) &&
>                            list_empty(&old_domain->devices))
>                               domain_exit(old_domain);
>               }
> 
> This is checking the type of the old domain only, freeing it if is
> not attached to any devices. Looking at this now, maybe the solution
> would be to distinguish between internally allocated dma domains and
> dma domains allocated via the external api, so we avoid freeing a
> domain a driver has reference to.
> 
I think this can also be solved by adopting default domain and moving
internal domain to the generic code at group level. i.e. when device is
added to a group, the group gets a default domain (also set as current
domain) for DMA APIs. When device is attached to an unmanaged domain
(VFIO container and IOMMU API), it will switch the current domain to
the new domain. Old domain will be inactive but not exited.
> >>> 
> >>> Why do we want to open this door? Probably we want the generic
> >>> iommu layer to handle these things (it's called default domain).  
> >> I’d like to allocate a domain and attach it to multiple devices in
> >> a group/multiple groups so that they share address translation,
> >> but still allow drivers for devices in those groups to use the
> >> dma_map_ops api.  
> > 
> > Just out of curiosity, why do you want to share a single domain
> > across multiple groups? By default, the groups and DMA domains are
> > normally 1-1 mapped, right?  
> 
> Currently we see each device in a group with their own domain. 
> find_or_alloc_domain looks at dma aliases to determine who shares
> domains whereas pci_device_group in the generic iommu code determines
> groups using a few other checks. We have observed internally that
> devices under a pcie switch will be put in the same group but they do
> not share a domain within that group.
> 
> Getting every device within a group to share a domain would get us
> 90% of the way to what we want. But we have some configurations where
> there exist devices put in other groups that we want to share
> translations with.
> 
This is indeed a huge disconnect between IOMMU API and VT-d handling
of the DMA API. IOMMU API operates at group granu (per group domain)
that is based on ACS whereas DMA API in VT-d has per device domain and
sharing is not based on ACS. So I agree you could have multiple domains
under the same group for DMA API use.

I am working on a patch to align the two, e.g. use per group
default domain for DMA APIs. My idea is as follows:
- support DOMAIN_DMA type in intel_iommu_domain_alloc()
- add device to group will attach to the default DMA domain
        - have to inherit any RMRR or identity mapping holes set up
          prior to IOMMU group setup.
        - Or perhaps implement apply_resv_region() in VT-d driver and
          move RMRR setup here.
- DMA map API, will use per group default domain instead of per device
  private domain

The change is rather pervasive so i am trying to set a balance for
short term functionality and complete clean up. Any ideas welcome.

> >   
> >>> So we can't just open the door but not cleanup the things right?  
> >> A user of domain_alloc and attach_device is responsible for
> >> detaching a domain if it is no longer needed and calling
> >> domain_free.  
> > 
> > Currently DMA API calls get_valid_domain_for_dev() to retrieve a DMA
> > domain. If the domain has already been allocated, return directly.
> > Otherwise, allocate and initialize a new one for the device. Let's
> > call domains allocated by get_valid_domain_for_dev() as "A".
> > 
> > If we open the door and allow another component to manage the DMA
> > domains through domain iommu_domain_alloc/free(). Let's call domains
> > allocated through iommu_domain_alloc() as "B".
> > 
> > So how can we sync between A and B?  
> 
> I’m not sure we need to sync them. Domain A would be replaced
> entirely for a certain device with domain B by attach_device. As we
> saw above domain A would be freed if there are no more devices
> attached to it. Domain B would be managed by the user of the iommu
> api. If a device is detached from domain B, the find_or_alloc_domain
> logic will take over once more and allocate a fresh domain.
> 
Agreed, it is more like switching domains when you switching APIs. But
the reserved region better be preserved.
> Cheers,
> James.
> 
> > 
> > Need to go through the code to find out more.
> > 
> > Best regards,
> > Lu Baolu  
> >> Cheers,
> >> James.  
> >>> 
> >>> I haven't spent time on details. So I cc'ed Jacob for corrections.
> >>> 
> >>> Best regards,
> >>> Lu Baolu
> >>>   
> >>>> Cheers,
> >>>> James.
> >>>> On Fri, Nov 2, 2018 at 2:43 AM Lu Baolu
> >>>> <baolu...@linux.intel.com> wrote:  
> >>>>> 
> >>>>> Hi,
> >>>>> 
> >>>>> On 10/30/18 10:18 PM, James Sewart via iommu wrote:  
> >>>>>> Hey,
> >>>>>> 
> >>>>>> I’ve been investigating the relationship between iommu groups
> >>>>>> and domains on our systems and have a few question. Why does
> >>>>>> the intel iommu code not allow allocating IOMMU_DOMAIN_DMA?
> >>>>>> Returning NULL when given this domain type has the side effect
> >>>>>> that the default_domain for an iommu group is not set, which,
> >>>>>> when using for e.g. dma_map_ops.map_page, means a domain is
> >>>>>> allocated per device.  
> >>>>> 
> >>>>> Intel vt-d driver doesn't implement the default domain and
> >>>>> allocates domain only on demanded. There are lots of things to
> >>>>> do before we allow iommu API to allocate domains other than
> >>>>> IOMMU_DOMAIN_UNMANAGED.
> >>>>> 
> >>>>> Best regards,
> >>>>> Lu Baolu
> >>>>>   
> >>>>>> 
> >>>>>> This seems to be the opposite behaviour to the AMD iommu code
> >>>>>> which supports allocating an IOMMU_DOMAIN_DMA and will only
> >>>>>> look to the iommu group if a domain is not attached to the
> >>>>>> device rather than allocating a new one. On AMD every device
> >>>>>> in an iommu group will share the same domain.
> >>>>>> 
> >>>>>> Appended is what I think a patch to implement domain_alloc for
> >>>>>> IOMMU_DOMAIN_DMA and also IOMMU_DOMAIN_IDENTITY would look
> >>>>>> like. Testing shows each device in a group will share a domain
> >>>>>> by default, it also allows allocating a new dma domain that
> >>>>>> can be successfully attached to a group with
> >>>>>> iommu_attach_group.
> >>>>>> 
> >>>>>> Looking for comment on why the behaviour is how it is
> >>>>>> currently and if there are any issues with the solution I’ve
> >>>>>> been testing.
> >>>>>> 
> >>>>>> Cheers,
> >>>>>> James.
> >>>>>> 
> >>>>>> 
> >>>>>> diff --git a/drivers/iommu/intel-iommu.c
> >>>>>> b/drivers/iommu/intel-iommu.c index bff2abd6..3a58389f 100644
> >>>>>> --- a/drivers/iommu/intel-iommu.c
> >>>>>> +++ b/drivers/iommu/intel-iommu.c
> >>>>>> @@ -5170,10 +5170,15 @@ static struct iommu_domain
> >>>>>> *intel_iommu_domain_alloc(unsigned type) struct dmar_domain
> >>>>>> *dmar_domain; struct iommu_domain *domain;
> >>>>>> 
> >>>>>> -     if (type != IOMMU_DOMAIN_UNMANAGED)
> >>>>>> +     if (type == IOMMU_DOMAIN_UNMANAGED)
> >>>>>> +             dmar_domain =
> >>>>>> alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
> >>>>>> +     else if(type == IOMMU_DOMAIN_DMA)
> >>>>>> +             dmar_domain = alloc_domain(0);
> >>>>>> +     else if(type == IOMMU_DOMAIN_IDENTITY)
> >>>>>> +             dmar_domain =
> >>>>>> alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
> >>>>>> +     else
> >>>>>>               return NULL;
> >>>>>> 
> >>>>>> -     dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
> >>>>>>       if (!dmar_domain) {
> >>>>>>               pr_err("Can't allocate dmar_domain\n");
> >>>>>>               return NULL;
> >>>>>> @@ -5186,9 +5191,12 @@ static struct iommu_domain
> >>>>>> *intel_iommu_domain_alloc(unsigned type)
> >>>>>> domain_update_iommu_cap(dmar_domain);
> >>>>>> 
> >>>>>>       domain = &dmar_domain->domain;
> >>>>>> -     domain->geometry.aperture_start = 0;
> >>>>>> -     domain->geometry.aperture_end   =
> >>>>>> __DOMAIN_MAX_ADDR(dmar_domain->gaw);
> >>>>>> -     domain->geometry.force_aperture = true;
> >>>>>> +
> >>>>>> +     if (type == IOMMU_DOMAIN_UNMANAGED) {
> >>>>>> +             domain->geometry.aperture_start = 0;
> >>>>>> +             domain->geometry.aperture_end   =
> >>>>>> __DOMAIN_MAX_ADDR(dmar_domain->gaw);
> >>>>>> +             domain->geometry.force_aperture = true;
> >>>>>> +     }
> >>>>>> 
> >>>>>>       return domain;
> >>>>>>   }
> >>>>>> _______________________________________________
> >>>>>> iommu mailing list
> >>>>>> iommu@lists.linux-foundation.org
> >>>>>> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> >>>>>>   
> 

[Jacob Pan]
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Reply via email to