Re: [Xen-devel] [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges

2018-08-09 Thread Jan Beulich
>>> On 09.08.18 at 12:23,  wrote:
> On Thu, Aug 09, 2018 at 01:36:57AM -0600, Jan Beulich wrote:
>> >>> On 08.08.18 at 18:18,  wrote:
>> > On Wed, Aug 08, 2018 at 07:15:54AM -0600, Jan Beulich wrote:
>> >> >>> On 08.08.18 at 12:07,  wrote:
>> >> > +/*
>> >> > + * If dom0-strict mode is enabled or the guest type is PVH/HVM 
>> >> > then exclude
>> >> > + * conventional RAM and let the common code map dom0's pages.
>> >> > + */
>> >> > +if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
>> >> > + (iommu_dom0_strict || is_hvm_domain(d)) )
>> >> > +return false;
>> >> > +if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
>> >> > + !iommu_dom0_reserved && !iommu_dom0_inclusive )
>> >> > +return false;
>> >> > +if ( !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE) &&
>> >> > + !page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
>> >> > + !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
>> >> > + (!iommu_dom0_inclusive || pfn > max_pfn) )
>> >> > +return false;
>> >> 
>> >> As page_is_ram_type() is, especially on systems with many E820
>> >> entries, not really cheap, I think at least a minimum amount of
>> >> optimization is on order here - after all you do this for every
>> >> single page of the system. Hence minimally the result of the first
>> >> CONVENTIONAL and RESERVED queries should be latched and
>> >> re-used (or "else" be used suitably). Ideally an approach would
>> >> be used which involved just a single iteration through the E820
>> >> map, but I realize this may be more than is feasible here.
>> > 
>> > This would be quite better if I could just fetch the type, then I
>> > could add a switch (type) { ... and it would be better IMO.
>> 
>> Except that, as said, there's no "the" type for an entire page.
>> Only a single byte can have an exact type.
> 
> Right, I don0t think the original code was much better in that regard
> anyway, neither I'm sure about how to handle this any better.

And I'm not demanding that you improve it beyond what's there
at this point. But we need to be reasonably certain it doesn't
regress.

Jan



___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges

2018-08-09 Thread Roger Pau Monné
On Thu, Aug 09, 2018 at 01:36:57AM -0600, Jan Beulich wrote:
> >>> On 08.08.18 at 18:18,  wrote:
> > On Wed, Aug 08, 2018 at 07:15:54AM -0600, Jan Beulich wrote:
> >> >>> On 08.08.18 at 12:07,  wrote:
> >> > @@ -134,13 +135,67 @@ void arch_iommu_domain_destroy(struct domain *d)
> >> >  {
> >> >  }
> >> >  
> >> > +static bool __hwdom_init hwdom_iommu_map(const struct domain *d, 
> >> > unsigned long pfn,
> >> > + unsigned long max_pfn)
> >> > +{
> >> > +unsigned int i;
> >> > +
> >> > +/*
> >> > + * Ignore any address below 1MB, that's already identity mapped by 
> >> > the
> >> > + * domain builder for HVM.
> >> > + */
> >> > +if ( (is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||
> >> 
> >> Careful again here with the distinction between Dom0 and hwdom:
> >> The domain builder you refer to is - aiui - the in-Xen one, i.e. the
> >> one _only_ dealing with Dom0.
> > 
> > So this should instead be:
> > 
> > if ( (is_control_domain(d) && is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||
> 
> From an abstract perspective (long term plans) is_control_domain()
> can be true for multiple domains, none of which may be Dom0 or
> hwdom. So no, I don't think the addition would help in any way.
> With the reference to the in-Xen domain builder, I think you really
> mean Dom0 here.

OK, I will check against the domid then.

> >> > +/*
> >> > + * If dom0-strict mode is enabled or the guest type is PVH/HVM then 
> >> > exclude
> >> > + * conventional RAM and let the common code map dom0's pages.
> >> > + */
> >> > +if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> >> > + (iommu_dom0_strict || is_hvm_domain(d)) )
> >> > +return false;
> >> > +if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
> >> > + !iommu_dom0_reserved && !iommu_dom0_inclusive )
> >> > +return false;
> >> > +if ( !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE) &&
> >> > + !page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
> >> > + !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> >> > + (!iommu_dom0_inclusive || pfn > max_pfn) )
> >> > +return false;
> >> 
> >> As page_is_ram_type() is, especially on systems with many E820
> >> entries, not really cheap, I think at least a minimum amount of
> >> optimization is on order here - after all you do this for every
> >> single page of the system. Hence minimally the result of the first
> >> CONVENTIONAL and RESERVED queries should be latched and
> >> re-used (or "else" be used suitably). Ideally an approach would
> >> be used which involved just a single iteration through the E820
> >> map, but I realize this may be more than is feasible here.
> > 
> > This would be quite better if I could just fetch the type, then I
> > could add a switch (type) { ... and it would be better IMO.
> 
> Except that, as said, there's no "the" type for an entire page.
> Only a single byte can have an exact type.

Right, I don0t think the original code was much better in that regard
anyway, neither I'm sure about how to handle this any better.

> >> Furthermore I'm unconvinced the !page_is_ram_type() uses
> >> are really what you want: The function returning false means
> >> "don't know", not "no". Perhaps the function needs to be made
> >> return a tristate (yes, no, or part of it).
> > 
> > Right, that's why I have three different !page_is_ram_type checks in
> > the last branch of the if, so that I can make sure it's not one of the
> > previous types and also account for holes.
> 
> I'm afraid I don't understand. Take the example of a single
> page being split in an unusable and a reserved part. Both
> respective function invocations will return false. Yet you
> want to exclude both unusable and reserved types when
> !iommu_dom0_inclusive, and hence your goal would be to
> exclude that page here.

Right, but I'm not sure how to fix this given the current interfaces.
I plan to introduce something like page_get_type() that returns the
whole type for a page in a similar fashion to what page_is_ram_type
currently does, but this is not going to solve the issue of a page
having different types.

> As to unusable - don't you break original behavior here
> anyway? Shouldn't the function return false when
> page_is_ram_type(pfn, RAM_TYPE_UNUSABLE), effectively
> independent of any command line options?

Yes, I think so, will fix it in the next version.

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges

2018-08-09 Thread Jan Beulich
>>> On 08.08.18 at 18:18,  wrote:
> On Wed, Aug 08, 2018 at 07:15:54AM -0600, Jan Beulich wrote:
>> >>> On 08.08.18 at 12:07,  wrote:
>> > Several people have reported hardware issues (malfunctioning USB
>> > controllers) due to iommu page faults on Intel hardware. Those faults
>> > are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
>> > be worked around on VTd hardware by manually adding RMRR entries on
>> > the command line, this is however limited to Intel hardware and quite
>> > cumbersome to do.
>> > 
>> > In order to solve those issues add a new dom0-iommu=reserved option
>> > that identity maps all regions marked as reserved in the memory map.
>> 
>> Considering that report we've had (yesterday? earlier today?), don't
>> we need to go further and use the union of RMRRs and reserved
>> regions? Iirc they had a case where an RMRR was not in a reserved
>> region ...
> 
> AFAICT (and I could be reading the code wrong) RMRR regions not on
> reserved regions are still correctly mapped to the guest.

Oh, yes, you're right - the logged message is really just that,
with no other enforced effects.

>> > --- a/docs/misc/xen-command-line.markdown
>> > +++ b/docs/misc/xen-command-line.markdown
>> > @@ -1205,7 +1205,7 @@ detection of systems known to misbehave upon 
>> > accesses 
>> > to that port.
>> >  >> Enable IOMMU debugging code (implies `verbose`).
>> >  
>> >  ### dom0-iommu
>> > -> `= List of [ none | strict | relaxed | inclusive ]`
>> > +> `= List of [ none | strict | relaxed | inclusive | reserved ]`
>> >  
>> >  * `none`: disables DMA remapping for Dom0.
>> >  
>> > @@ -1233,6 +1233,15 @@ meaning:
>> >option is only applicable to a PV Dom0 and is enabled by default on 
>> > Intel
>> >hardware.
>> >  
>> > +* `reserved`: sets up DMA remapping for all the reserved regions in the  
>> > memory
>> > +  map for Dom0. Use this to work around firmware issues providing 
>> > incorrect
>> > +  RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU accesses
>> > +  for Dom0, all memory regions marked as reserved in the memory map that 
>> > don't
>> > +  overlap with any MMIO region from emulated devices will be identity 
>> > mapped.
>> > +  This option maps a subset of the memory that would be mapped when using 
>> > the
>> > +  `inclusive` option. This option is available to a PVH Dom0 and is 
>> > enabled by
>> > +  default on Intel hardware.
>> 
>> The sub-options so far were quite clear in their meanings, but
>> "dom0-iommu=reserved" might mean all sorts of things to me, but quite
>> certainly not "map all reserved regions". "map-reserved" perhaps?
> 
> Then we should also have 'map-inclusive' for symmetry IMO.

I don't mind at all such symmetry to be established.

>> > @@ -134,13 +135,67 @@ void arch_iommu_domain_destroy(struct domain *d)
>> >  {
>> >  }
>> >  
>> > +static bool __hwdom_init hwdom_iommu_map(const struct domain *d, unsigned 
>> > long pfn,
>> > + unsigned long max_pfn)
>> > +{
>> > +unsigned int i;
>> > +
>> > +/*
>> > + * Ignore any address below 1MB, that's already identity mapped by the
>> > + * domain builder for HVM.
>> > + */
>> > +if ( (is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||
>> 
>> Careful again here with the distinction between Dom0 and hwdom:
>> The domain builder you refer to is - aiui - the in-Xen one, i.e. the
>> one _only_ dealing with Dom0.
> 
> So this should instead be:
> 
> if ( (is_control_domain(d) && is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||

From an abstract perspective (long term plans) is_control_domain()
can be true for multiple domains, none of which may be Dom0 or
hwdom. So no, I don't think the addition would help in any way.
With the reference to the in-Xen domain builder, I think you really
mean Dom0 here.

>> > +/*
>> > + * If dom0-strict mode is enabled or the guest type is PVH/HVM then 
>> > exclude
>> > + * conventional RAM and let the common code map dom0's pages.
>> > + */
>> > +if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
>> > + (iommu_dom0_strict || is_hvm_domain(d)) )
>> > +return false;
>> > +if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
>> > + !iommu_dom0_reserved && !iommu_dom0_inclusive )
>> > +return false;
>> > +if ( !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE) &&
>> > + !page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
>> > + !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
>> > + (!iommu_dom0_inclusive || pfn > max_pfn) )
>> > +return false;
>> 
>> As page_is_ram_type() is, especially on systems with many E820
>> entries, not really cheap, I think at least a minimum amount of
>> optimization is on order here - after all you do this for every
>> single page of the system. Hence minimally the result of the first
>> CONVENTIONAL and RESERVED queries should be latched and
>> re-used (or "else" be used suitably). Ideally an approach would
>> 

Re: [Xen-devel] [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges

2018-08-08 Thread Roger Pau Monné
On Wed, Aug 08, 2018 at 07:15:54AM -0600, Jan Beulich wrote:
> >>> On 08.08.18 at 12:07,  wrote:
> > Several people have reported hardware issues (malfunctioning USB
> > controllers) due to iommu page faults on Intel hardware. Those faults
> > are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
> > be worked around on VTd hardware by manually adding RMRR entries on
> > the command line, this is however limited to Intel hardware and quite
> > cumbersome to do.
> > 
> > In order to solve those issues add a new dom0-iommu=reserved option
> > that identity maps all regions marked as reserved in the memory map.
> 
> Considering that report we've had (yesterday? earlier today?), don't
> we need to go further and use the union of RMRRs and reserved
> regions? Iirc they had a case where an RMRR was not in a reserved
> region ...

AFAICT (and I could be reading the code wrong) RMRR regions not on
reserved regions are still correctly mapped to the guest.

> > --- a/docs/misc/xen-command-line.markdown
> > +++ b/docs/misc/xen-command-line.markdown
> > @@ -1205,7 +1205,7 @@ detection of systems known to misbehave upon accesses 
> > to that port.
> >  >> Enable IOMMU debugging code (implies `verbose`).
> >  
> >  ### dom0-iommu
> > -> `= List of [ none | strict | relaxed | inclusive ]`
> > +> `= List of [ none | strict | relaxed | inclusive | reserved ]`
> >  
> >  * `none`: disables DMA remapping for Dom0.
> >  
> > @@ -1233,6 +1233,15 @@ meaning:
> >option is only applicable to a PV Dom0 and is enabled by default on Intel
> >hardware.
> >  
> > +* `reserved`: sets up DMA remapping for all the reserved regions in the  
> > memory
> > +  map for Dom0. Use this to work around firmware issues providing incorrect
> > +  RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU accesses
> > +  for Dom0, all memory regions marked as reserved in the memory map that 
> > don't
> > +  overlap with any MMIO region from emulated devices will be identity 
> > mapped.
> > +  This option maps a subset of the memory that would be mapped when using 
> > the
> > +  `inclusive` option. This option is available to a PVH Dom0 and is 
> > enabled by
> > +  default on Intel hardware.
> 
> The sub-options so far were quite clear in their meanings, but
> "dom0-iommu=reserved" might mean all sorts of things to me, but quite
> certainly not "map all reserved regions". "map-reserved" perhaps?

Then we should also have 'map-inclusive' for symmetry IMO.

> > --- a/xen/arch/x86/hvm/io.c
> > +++ b/xen/arch/x86/hvm/io.c
> > @@ -404,6 +404,11 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const 
> > struct domain *d,
> >  return NULL;
> >  }
> >  
> > +bool vpci_mmcfg_address(const struct domain *d, paddr_t addr)
> > +{
> > +return vpci_mmcfg_find(d, addr);
> > +}
> 
> I think the function name should have an "is_" somewhere, to make
> clear address is a noun here and not a verb.
> 
> > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> > @@ -256,6 +256,9 @@ static void __hwdom_init amd_iommu_hwdom_init(struct 
> > domain *d)
> >  /* Inclusive IOMMU mappings are disabled by default on AMD hardware. */
> >  iommu_dom0_inclusive = iommu_dom0_inclusive == -1 ? false
> >: 
> > iommu_dom0_inclusive;
> > +/* Reserved IOMMU mappings are disabled by default on AMD hardware. */
> > +iommu_dom0_reserved = iommu_dom0_reserved == -1 ? false
> > +: iommu_dom0_reserved;
> 
> Especially seeing these two together now, I think an if() each would
> be easier to read (not the least because of being shorter). To me,
> the main purpose of the conditional operator is to allow shortening
> simple if/else pairs, rather than lengthening simple if()-s.
> 
> > @@ -134,13 +135,67 @@ void arch_iommu_domain_destroy(struct domain *d)
> >  {
> >  }
> >  
> > +static bool __hwdom_init hwdom_iommu_map(const struct domain *d, unsigned 
> > long pfn,
> > + unsigned long max_pfn)
> > +{
> > +unsigned int i;
> > +
> > +/*
> > + * Ignore any address below 1MB, that's already identity mapped by the
> > + * domain builder for HVM.
> > + */
> > +if ( (is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||
> 
> Careful again here with the distinction between Dom0 and hwdom:
> The domain builder you refer to is - aiui - the in-Xen one, i.e. the
> one _only_ dealing with Dom0.

So this should instead be:

if ( (is_control_domain(d) && is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||

> > +/*
> > + * If dom0-strict mode is enabled or the guest type is PVH/HVM then 
> > exclude
> > + * conventional RAM and let the common code map dom0's pages.
> > + */
> > +if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> > + (iommu_dom0_strict || is_hvm_domain(d)) )
> > +return false;
> > +if ( 

Re: [Xen-devel] [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges

2018-08-08 Thread Jan Beulich
>>> On 08.08.18 at 12:07,  wrote:
> Several people have reported hardware issues (malfunctioning USB
> controllers) due to iommu page faults on Intel hardware. Those faults
> are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
> be worked around on VTd hardware by manually adding RMRR entries on
> the command line, this is however limited to Intel hardware and quite
> cumbersome to do.
> 
> In order to solve those issues add a new dom0-iommu=reserved option
> that identity maps all regions marked as reserved in the memory map.

Considering that report we've had (yesterday? earlier today?), don't
we need to go further and use the union of RMRRs and reserved
regions? Iirc they had a case where an RMRR was not in a reserved
region ...

> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1205,7 +1205,7 @@ detection of systems known to misbehave upon accesses 
> to that port.
>  >> Enable IOMMU debugging code (implies `verbose`).
>  
>  ### dom0-iommu
> -> `= List of [ none | strict | relaxed | inclusive ]`
> +> `= List of [ none | strict | relaxed | inclusive | reserved ]`
>  
>  * `none`: disables DMA remapping for Dom0.
>  
> @@ -1233,6 +1233,15 @@ meaning:
>option is only applicable to a PV Dom0 and is enabled by default on Intel
>hardware.
>  
> +* `reserved`: sets up DMA remapping for all the reserved regions in the  
> memory
> +  map for Dom0. Use this to work around firmware issues providing incorrect
> +  RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU accesses
> +  for Dom0, all memory regions marked as reserved in the memory map that 
> don't
> +  overlap with any MMIO region from emulated devices will be identity mapped.
> +  This option maps a subset of the memory that would be mapped when using the
> +  `inclusive` option. This option is available to a PVH Dom0 and is enabled 
> by
> +  default on Intel hardware.

The sub-options so far were quite clear in their meanings, but
"dom0-iommu=reserved" might mean all sorts of things to me, but quite
certainly not "map all reserved regions". "map-reserved" perhaps?

> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -404,6 +404,11 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const 
> struct domain *d,
>  return NULL;
>  }
>  
> +bool vpci_mmcfg_address(const struct domain *d, paddr_t addr)
> +{
> +return vpci_mmcfg_find(d, addr);
> +}

I think the function name should have an "is_" somewhere, to make
clear address is a noun here and not a verb.

> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -256,6 +256,9 @@ static void __hwdom_init amd_iommu_hwdom_init(struct 
> domain *d)
>  /* Inclusive IOMMU mappings are disabled by default on AMD hardware. */
>  iommu_dom0_inclusive = iommu_dom0_inclusive == -1 ? false
>: iommu_dom0_inclusive;
> +/* Reserved IOMMU mappings are disabled by default on AMD hardware. */
> +iommu_dom0_reserved = iommu_dom0_reserved == -1 ? false
> +: iommu_dom0_reserved;

Especially seeing these two together now, I think an if() each would
be easier to read (not the least because of being shorter). To me,
the main purpose of the conditional operator is to allow shortening
simple if/else pairs, rather than lengthening simple if()-s.

> @@ -134,13 +135,67 @@ void arch_iommu_domain_destroy(struct domain *d)
>  {
>  }
>  
> +static bool __hwdom_init hwdom_iommu_map(const struct domain *d, unsigned 
> long pfn,
> + unsigned long max_pfn)
> +{
> +unsigned int i;
> +
> +/*
> + * Ignore any address below 1MB, that's already identity mapped by the
> + * domain builder for HVM.
> + */
> +if ( (is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||

Careful again here with the distinction between Dom0 and hwdom:
The domain builder you refer to is - aiui - the in-Xen one, i.e. the
one _only_ dealing with Dom0.

> +/*
> + * If dom0-strict mode is enabled or the guest type is PVH/HVM then 
> exclude
> + * conventional RAM and let the common code map dom0's pages.
> + */
> +if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> + (iommu_dom0_strict || is_hvm_domain(d)) )
> +return false;
> +if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
> + !iommu_dom0_reserved && !iommu_dom0_inclusive )
> +return false;
> +if ( !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE) &&
> + !page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
> + !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> + (!iommu_dom0_inclusive || pfn > max_pfn) )
> +return false;

As page_is_ram_type() is, especially on systems with many E820
entries, not really cheap, I think at least a minimum amount of
optimization is on order here - after all you do this for every

[Xen-devel] [PATCH v4 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges

2018-08-08 Thread Roger Pau Monne
Several people have reported hardware issues (malfunctioning USB
controllers) due to iommu page faults on Intel hardware. Those faults
are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
be worked around on VTd hardware by manually adding RMRR entries on
the command line, this is however limited to Intel hardware and quite
cumbersome to do.

In order to solve those issues add a new dom0-iommu=reserved option
that identity maps all regions marked as reserved in the memory map.
Note that regions used by devices emulated by Xen (LAPIC, IO-APIC or
PCIe MCFG regions) are specifically avoided. Note that this option is
available to a PVH Dom0 (as opposed to the inclusive option which only
works for PV Dom0).

Signed-off-by: Roger Pau Monné 
Reviewed-by: Paul Durrant 
---
Changes since v3:
 - Add mappings if the iommu page tables are shared.

Changes since v2:
 - Fix comment regarding dom0-strict.
 - Change documentation style of xen command line.
 - Rename iommu_map to hwdom_iommu_map.
 - Move all the checks to hwdom_iommu_map.

Changes since v1:
 - Introduce a new reserved option instead of abusing the inclusive
   option.
 - Use the same helper function for PV and PVH in order to decide if a
   page should be added to the domain page tables.
 - Use the data inside of the domain struct to detect overlaps with
   emulated MMIO regions.
---
Cc: Andrew Cooper 
Cc: George Dunlap 
Cc: Ian Jackson 
Cc: Jan Beulich 
Cc: Julien Grall 
Cc: Konrad Rzeszutek Wilk 
Cc: Stefano Stabellini 
Cc: Tim Deegan 
Cc: Wei Liu 
---
 docs/misc/xen-command-line.markdown | 11 ++-
 xen/arch/x86/hvm/io.c   |  5 ++
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  3 +
 xen/drivers/passthrough/iommu.c |  3 +
 xen/drivers/passthrough/vtd/iommu.c |  3 +
 xen/drivers/passthrough/x86/iommu.c | 92 ++---
 xen/include/asm-x86/hvm/io.h|  3 +
 xen/include/xen/iommu.h |  2 +-
 8 files changed, 91 insertions(+), 31 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown 
b/docs/misc/xen-command-line.markdown
index 90b32fe3f0..59ec2afc5d 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1205,7 +1205,7 @@ detection of systems known to misbehave upon accesses to 
that port.
 >> Enable IOMMU debugging code (implies `verbose`).
 
 ### dom0-iommu
-> `= List of [ none | strict | relaxed | inclusive ]`
+> `= List of [ none | strict | relaxed | inclusive | reserved ]`
 
 * `none`: disables DMA remapping for Dom0.
 
@@ -1233,6 +1233,15 @@ meaning:
   option is only applicable to a PV Dom0 and is enabled by default on Intel
   hardware.
 
+* `reserved`: sets up DMA remapping for all the reserved regions in the memory
+  map for Dom0. Use this to work around firmware issues providing incorrect
+  RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU accesses
+  for Dom0, all memory regions marked as reserved in the memory map that don't
+  overlap with any MMIO region from emulated devices will be identity mapped.
+  This option maps a subset of the memory that would be mapped when using the
+  `inclusive` option. This option is available to a PVH Dom0 and is enabled by
+  default on Intel hardware.
+
 ### iommu\_dev\_iotlb\_timeout
 > `= `
 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index bf4d8748d3..5e01c33890 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -404,6 +404,11 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const 
struct domain *d,
 return NULL;
 }
 
+bool vpci_mmcfg_address(const struct domain *d, paddr_t addr)
+{
+return vpci_mmcfg_find(d, addr);
+}
+
 static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
paddr_t addr, pci_sbdf_t *sbdf)
 {
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c 
b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 0e0c99c942..2c2867d088 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -256,6 +256,9 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain 
*d)
 /* Inclusive IOMMU mappings are disabled by default on AMD hardware. */
 iommu_dom0_inclusive = iommu_dom0_inclusive == -1 ? false
   : iommu_dom0_inclusive;
+/* Reserved IOMMU mappings are disabled by default on AMD hardware. */
+iommu_dom0_reserved = iommu_dom0_reserved == -1 ? false
+: iommu_dom0_reserved;
 
 if ( allocate_domain_resources(dom_iommu(d)) )
 BUG();
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index f15c94be42..9c991bd2cf 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -75,6 +75,7 @@ custom_param("dom0-iommu", parse_dom0_iommu_param);
 bool __hwdom_initdata iommu_dom0_strict;
 bool