On Wed, May 04, 2022 at 11:32:51AM +0200, Jan Beulich wrote:
> On 03.05.2022 16:50, Jan Beulich wrote:
> > On 03.05.2022 15:00, Roger Pau Monné wrote:
> >> On Mon, Apr 25, 2022 at 10:34:23AM +0200, Jan Beulich wrote:
> >>> While already the case for PVH, there's no reason to treat PV
> >>> differently here, though of course the addresses get taken from another
> >>> source in this case. Except that, to match CPU side mappings, by default
> >>> we permit r/o ones. This then also means we now deal consistently with
> >>> IO-APICs whose MMIO is or is not covered by E820 reserved regions.
> >>>
> >>> Signed-off-by: Jan Beulich <jbeul...@suse.com>
> >>> ---
> >>> [integrated] v1: Integrate into series.
> >>> [standalone] v2: Keep IOMMU mappings in sync with CPU ones.
> >>>
> >>> --- a/xen/drivers/passthrough/x86/iommu.c
> >>> +++ b/xen/drivers/passthrough/x86/iommu.c
> >>> @@ -275,12 +275,12 @@ void iommu_identity_map_teardown(struct
> >>>      }
> >>>  }
> >>>  
> >>> -static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
> >>> -                                         unsigned long pfn,
> >>> -                                         unsigned long max_pfn)
> >>> +static unsigned int __hwdom_init hwdom_iommu_map(const struct domain *d,
> >>> +                                                 unsigned long pfn,
> >>> +                                                 unsigned long max_pfn)
> >>>  {
> >>>      mfn_t mfn = _mfn(pfn);
> >>> -    unsigned int i, type;
> >>> +    unsigned int i, type, perms = IOMMUF_readable | IOMMUF_writable;
> >>>  
> >>>      /*
> >>>       * Set up 1:1 mapping for dom0. Default to include only conventional RAM
> >>> @@ -289,44 +289,60 @@ static bool __hwdom_init hwdom_iommu_map
> >>>       * that fall in unusable ranges for PV Dom0.
> >>>       */
> >>>      if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) )
> >>> -        return false;
> >>> +        return 0;
> >>>  
> >>>      switch ( type = page_get_ram_type(mfn) )
> >>>      {
> >>>      case RAM_TYPE_UNUSABLE:
> >>> -        return false;
> >>> +        return 0;
> >>>  
> >>>      case RAM_TYPE_CONVENTIONAL:
> >>>          if ( iommu_hwdom_strict )
> >>> -            return false;
> >>> +            return 0;
> >>>          break;
> >>>  
> >>>      default:
> >>>          if ( type & RAM_TYPE_RESERVED )
> >>>          {
> >>>              if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
> >>> -                return false;
> >>> +                perms = 0;
> >>>          }
> >>> -        else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > max_pfn )
> >>> -            return false;
> >>> +        else if ( is_hvm_domain(d) )
> >>> +            return 0;
> >>> +        else if ( !iommu_hwdom_inclusive || pfn > max_pfn )
> >>> +            perms = 0;
> >>>      }
> >>>  
> >>>      /* Check that it doesn't overlap with the Interrupt Address Range. */
> >>>      if ( pfn >= 0xfee00 && pfn <= 0xfeeff )
> >>> -        return false;
> >>> +        return 0;
> >>>      /* ... or the IO-APIC */
> >>> -    for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ )
> >>> -        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> >>> -            return false;
> >>> +    if ( has_vioapic(d) )
> >>> +    {
> >>> +        for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
> >>> +            if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> >>> +                return 0;
> >>> +    }
> >>> +    else if ( is_pv_domain(d) )
> >>> +    {
> >>> +        /*
> >>> +         * Be consistent with CPU mappings: Dom0 is permitted to establish r/o
> >>> +         * ones there, so it should also have such established for IOMMUs.
> >>> +         */
> >>> +        for ( i = 0; i < nr_ioapics; i++ )
> >>> +            if ( pfn == PFN_DOWN(mp_ioapics[i].mpc_apicaddr) )
> >>> +                return rangeset_contains_singleton(mmio_ro_ranges, pfn)
> >>> +                       ? IOMMUF_readable : 0;
> >>
> >> If we really are after consistency with CPU side mappings, we should
> >> likely take the whole contents of mmio_ro_ranges and d->iomem_caps
> >> into account, not just the pages belonging to the IO-APIC?
> >>
> >> There could also be HPET pages mapped as RO for PV.
> > 
> > Hmm. This would be an even bigger functional change, but it would indeed
> > further improve consistency. But shouldn't we then also establish r/w
> > mappings for stuff in ->iomem_caps but not in mmio_ro_ranges? This would
> > feel like going too far ...
> 
> FTAOD I didn't mean to say that I think such mappings shouldn't be there;
> I have been of the opinion that e.g. I/O directly to/from the linear
> frame buffer of a graphics device should in principle be permitted. But
> which specific mappings to put in place can imo not be derived from
> ->iomem_caps, as we merely subtract certain ranges after initially having
> set all bits in it. Besides ranges not mapping any MMIO, even something
> like the PCI ECAM ranges (parts of which we may also force to r/o, and
> which we would hence cover here if I followed your suggestion) are
> questionable in this regard.

Right, ->iomem_caps is indeed too wide for our purpose.  What
about using something like:

else if ( is_pv_domain(d) )
{
    if ( !iomem_access_permitted(d, pfn, pfn) )
        return 0;
    if ( rangeset_contains_singleton(mmio_ro_ranges, pfn) )
        return IOMMUF_readable;
}

That would get us a bit closer to the allowed CPU-side mappings, and we
wouldn't need to special-case the IO-APIC or HPET addresses, as those are
already added to ->iomem_caps or mmio_ro_ranges respectively by
dom0_setup_permissions().
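
For clarity, folded into the v2 hunk the tail of hwdom_iommu_map() could then
look roughly like the below (an untested sketch only meant to show placement;
the PVH/vIO-APIC branch stays exactly as in your patch):

    /* ... or the IO-APIC (PVH Dom0; PV goes through the generic checks). */
    if ( has_vioapic(d) )
    {
        for ( i = 0; i < d->arch.hvm.nr_vioapics; i++ )
            if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
                return 0;
    }
    else if ( is_pv_domain(d) )
    {
        /*
         * Be consistent with CPU mappings: skip anything Dom0 may not map at
         * all on the CPU side, and degrade to r/o whatever is only mapped r/o
         * there (IO-APICs, and the HPET when it is in mmio_ro_ranges).
         */
        if ( !iomem_access_permitted(d, pfn, pfn) )
            return 0;
        if ( rangeset_contains_singleton(mmio_ro_ranges, pfn) )
            return IOMMUF_readable;
    }

The ordering matters a bit: checking iomem_access_permitted() first means
anything Dom0 cannot map on the CPU side at all is also left unmapped in the
IOMMU, and only afterwards do we downgrade the r/o ranges to IOMMUF_readable.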

Thanks, Roger.
