[Xen-devel] [PATCH v4 2/4] iommu: make iommu_inclusive_mapping a suboption of dom0-iommu
Introduce a new dom0-iommu=inclusive generic option that supersedes iommu_inclusive_mapping. The previous behaviour is preserved and the option should only be enabled by default on Intel hardware. No functional change intended. Signed-off-by: Roger Pau Monné Reviewed-by: Paul Durrant --- Changes since v2: - Fix typo in commit message. - Change style and text of the documentation in xen command line. - Set the defaults in {intel/amd}_iommu_hwdom_init for inclusive. - Re-add the iommu_dom0_passthrough || !is_pv_domain(d) check. Changes since v1: - Use dom0-iommu instead of the iommu option. - Only enable by default on Intel hardware. --- Cc: Andrew Cooper Cc: George Dunlap Cc: Ian Jackson Cc: Jan Beulich Cc: Julien Grall Cc: Konrad Rzeszutek Wilk Cc: Stefano Stabellini Cc: Tim Deegan Cc: Wei Liu Cc: Kevin Tian --- docs/misc/xen-command-line.markdown | 17 +- xen/drivers/passthrough/amd/pci_amd_iommu.c | 4 ++ xen/drivers/passthrough/arm/iommu.c | 4 ++ xen/drivers/passthrough/iommu.c | 23 ++-- xen/drivers/passthrough/vtd/extern.h| 2 - xen/drivers/passthrough/vtd/iommu.c | 8 ++- xen/drivers/passthrough/vtd/x86/vtd.c | 58 +--- xen/drivers/passthrough/x86/iommu.c | 59 + xen/include/xen/iommu.h | 2 + 9 files changed, 109 insertions(+), 68 deletions(-) diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown index ea451f088e..90b32fe3f0 100644 --- a/docs/misc/xen-command-line.markdown +++ b/docs/misc/xen-command-line.markdown @@ -1205,7 +1205,7 @@ detection of systems known to misbehave upon accesses to that port. >> Enable IOMMU debugging code (implies `verbose`). ### dom0-iommu -> `= List of [ none | strict | relaxed ]` +> `= List of [ none | strict | relaxed | inclusive ]` * `none`: disables DMA remapping for Dom0. @@ -1221,6 +1221,18 @@ PV Dom0: Note that all the above options are mutually exclusive. Specifying more than one on the `dom0-iommu` command line will result in undefined behavior. +The following options control whether non-RAM regions are added to the Dom0 +iommu tables. Note that they can be prefixed with `no-` to effect the inverse +meaning: + +* `inclusive`: sets up DMA remapping for all the non-RAM memory below 4GB + except for unusable ranges. Use this to work around firmware issues providing + incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU + accesses for Dom0, with this option all pages up to 4GB, not marked as + unusable in the E820 table, will get a mapping established. Note that this + option is only applicable to a PV Dom0 and is enabled by default on Intel + hardware. + ### iommu\_dev\_iotlb\_timeout > `= ` @@ -1233,6 +1245,9 @@ wait descriptor timed out', try increasing this value. ### iommu\_inclusive\_mapping (VT-d) > `= ` +**WARNING: This command line option is deprecated, and superseded by +_dom0-iommu=inclusive_ - using both options in combination is undefined.** + > Default: `true` Use this to work around firmware issues providing incorrect RMRR entries. diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c index eeacf713e4..0e0c99c942 100644 --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -253,6 +253,10 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d) unsigned long i; const struct amd_iommu *iommu; +/* Inclusive IOMMU mappings are disabled by default on AMD hardware. */ +iommu_dom0_inclusive = iommu_dom0_inclusive == -1 ? false + : iommu_dom0_inclusive; + if ( allocate_domain_resources(dom_iommu(d)) ) BUG(); diff --git a/xen/drivers/passthrough/arm/iommu.c b/xen/drivers/passthrough/arm/iommu.c index 95b1abb972..325997b19f 100644 --- a/xen/drivers/passthrough/arm/iommu.c +++ b/xen/drivers/passthrough/arm/iommu.c @@ -73,3 +73,7 @@ int arch_iommu_populate_page_table(struct domain *d) /* The IOMMU shares the p2m with the CPU */ return -ENOSYS; } + +void __hwdom_init arch_iommu_hwdom_init(struct domain *d) +{ +} diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c index 830560bdcf..f15c94be42 100644 --- a/xen/drivers/passthrough/iommu.c +++ b/xen/drivers/passthrough/iommu.c @@ -74,6 +74,7 @@ bool_t __read_mostly amd_iommu_perdev_intremap = 1; custom_param("dom0-iommu", parse_dom0_iommu_param); bool __hwdom_initdata iommu_dom0_strict; bool __read_mostly iommu_dom0_passthrough; +int8_t __hwdom_initdata iommu_dom0_inclusive = -1; DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb); @@ -144,16 +145,23 @@ static int __init parse_dom0_iommu_param(const char *s) int rc = 0; do { +bool val = !!strncmp(s, "no-", 3); + +if ( !val ) +s += 3; + ss = strchr(
Re: [Xen-devel] [PATCH v4 2/4] iommu: make iommu_inclusive_mapping a suboption of dom0-iommu
>>> On 08.08.18 at 12:07, wrote: > Introduce a new dom0-iommu=inclusive generic option that supersedes > iommu_inclusive_mapping. The previous behaviour is preserved and the > option should only be enabled by default on Intel hardware. Why "should" instead of "is"? > @@ -1221,6 +1221,18 @@ PV Dom0: > Note that all the above options are mutually exclusive. Specifying more than > one on the `dom0-iommu` command line will result in undefined behavior. > > +The following options control whether non-RAM regions are added to the Dom0 > +iommu tables. Note that they can be prefixed with `no-` to effect the inverse > +meaning: I'm not particularly happy about the mentioning of "no-" here: Why is this better than the also permitted "=0" etc suffixes? Keep it generic, just like other options do. > +* `inclusive`: sets up DMA remapping for all the non-RAM memory below 4GB > + except for unusable ranges. Use this to work around firmware issues > providing > + incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU > + accesses for Dom0, with this option all pages up to 4GB, not marked as > + unusable in the E820 table, will get a mapping established. Note that this > + option is only applicable to a PV Dom0 and is enabled by default on Intel > + hardware. No word at all about the interaction with none/strict/relaxed? I think, as mentioned for patch 1, "none" renders this option meaningless as well. But for "relaxed" it's pretty unclear, because from E820 alone you can't judge whether e.g. a reserved region is RAM or MMIO. (As an implication, the mentioning of RAM in patch 1's doc for "relaxed" then looks symmetrically wrong, just like I've already asked to replace "memory" by "RAM" for "strict".) > --- a/xen/drivers/passthrough/arm/iommu.c > +++ b/xen/drivers/passthrough/arm/iommu.c > @@ -73,3 +73,7 @@ int arch_iommu_populate_page_table(struct domain *d) > /* The IOMMU shares the p2m with the CPU */ > return -ENOSYS; > } > + > +void __hwdom_init arch_iommu_hwdom_init(struct domain *d) > +{ > +} The option being in common code, I think you also need to set it for ARM, so it won't remain at its default of -1. > @@ -144,16 +145,23 @@ static int __init parse_dom0_iommu_param(const char *s) > int rc = 0; > > do { > +bool val = !!strncmp(s, "no-", 3); Oh, you do a literal comparison against "no-". Please don't, that's what we have parse_boolean() for. > @@ -202,6 +210,13 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) > if ( !iommu_enabled ) > return; > > +if ( iommu_dom0_inclusive == true && !is_pv_domain(d) ) Why the "== true"? It shouldn't have its initializer value of -1 anymore at this point. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 2/4] iommu: make iommu_inclusive_mapping a suboption of dom0-iommu
On Wed, Aug 08, 2018 at 06:32:00AM -0600, Jan Beulich wrote: > >>> On 08.08.18 at 12:07, wrote: > > Introduce a new dom0-iommu=inclusive generic option that supersedes > > iommu_inclusive_mapping. The previous behaviour is preserved and the > > option should only be enabled by default on Intel hardware. > > Why "should" instead of "is"? > > > @@ -1221,6 +1221,18 @@ PV Dom0: > > Note that all the above options are mutually exclusive. Specifying more > > than > > one on the `dom0-iommu` command line will result in undefined behavior. > > > > +The following options control whether non-RAM regions are added to the Dom0 > > +iommu tables. Note that they can be prefixed with `no-` to effect the > > inverse > > +meaning: > > I'm not particularly happy about the mentioning of "no-" here: Why is > this better than the also permitted "=0" etc suffixes? Keep it generic, > just like other options do. Oh, I've just copied this text from the iommu option. Should I change this to: "The following boolean options control whether non-RAM regions are added to the Dom0 iommu tables:" > > +* `inclusive`: sets up DMA remapping for all the non-RAM memory below 4GB > > + except for unusable ranges. Use this to work around firmware issues > > providing > > + incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU > > + accesses for Dom0, with this option all pages up to 4GB, not marked as > > + unusable in the E820 table, will get a mapping established. Note that > > this > > + option is only applicable to a PV Dom0 and is enabled by default on Intel > > + hardware. > > No word at all about the interaction with none/strict/relaxed? I think, > as mentioned for patch 1, "none" renders this option meaningless as > well. But for "relaxed" it's pretty unclear, because from E820 alone > you can't judge whether e.g. a reserved region is RAM or MMIO. (As > an implication, the mentioning of RAM in patch 1's doc for "relaxed" > then looks symmetrically wrong, just like I've already asked to replace > "memory" by "RAM" for "strict".) Hm, I'm not sure I follow. 'relaxed' refers to how the regions marked as RAM in the memory map (E820_RAM) will be mapped to the guest. The inclusive option OTOH refers to if/how non-RAM regions in the memory map will be mapped to the guest. It doesn't matter if a reserved region (E820_RESERVED) is actually backed by RAM or MMIO, it will be mapped to the guest because it's a reserved region on the memory map. > > --- a/xen/drivers/passthrough/arm/iommu.c > > +++ b/xen/drivers/passthrough/arm/iommu.c > > @@ -73,3 +73,7 @@ int arch_iommu_populate_page_table(struct domain *d) > > /* The IOMMU shares the p2m with the CPU */ > > return -ENOSYS; > > } > > + > > +void __hwdom_init arch_iommu_hwdom_init(struct domain *d) > > +{ > > +} > > The option being in common code, I think you also need to set it for > ARM, so it won't remain at its default of -1. > > > @@ -144,16 +145,23 @@ static int __init parse_dom0_iommu_param(const char > > *s) > > int rc = 0; > > > > do { > > +bool val = !!strncmp(s, "no-", 3); > > Oh, you do a literal comparison against "no-". Please don't, that's what > we have parse_boolean() for. Thanks, I will fix it. I've mostly cloned what the iommu option currently does. > > @@ -202,6 +210,13 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) > > if ( !iommu_enabled ) > > return; > > > > +if ( iommu_dom0_inclusive == true && !is_pv_domain(d) ) > > Why the "== true"? It shouldn't have its initializer value of -1 anymore > at this point. It can have a value of -1, AFAICT the default values are set by hd->platform_ops->hwdom_init which is called later in this function. Thanks, Roger. ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v4 2/4] iommu: make iommu_inclusive_mapping a suboption of dom0-iommu
>>> On 08.08.18 at 18:09, wrote: > On Wed, Aug 08, 2018 at 06:32:00AM -0600, Jan Beulich wrote: >> >>> On 08.08.18 at 12:07, wrote: >> > Introduce a new dom0-iommu=inclusive generic option that supersedes >> > iommu_inclusive_mapping. The previous behaviour is preserved and the >> > option should only be enabled by default on Intel hardware. >> >> Why "should" instead of "is"? >> >> > @@ -1221,6 +1221,18 @@ PV Dom0: >> > Note that all the above options are mutually exclusive. Specifying more >> > than >> > one on the `dom0-iommu` command line will result in undefined behavior. >> > >> > +The following options control whether non-RAM regions are added to the >> > Dom0 >> > +iommu tables. Note that they can be prefixed with `no-` to effect the >> > inverse >> > +meaning: >> >> I'm not particularly happy about the mentioning of "no-" here: Why is >> this better than the also permitted "=0" etc suffixes? Keep it generic, >> just like other options do. > > Oh, I've just copied this text from the iommu option. Should I change > this to: > > "The following boolean options control whether non-RAM regions are > added to the Dom0 iommu tables:" Yes please, much better. >> > +* `inclusive`: sets up DMA remapping for all the non-RAM memory below 4GB >> > + except for unusable ranges. Use this to work around firmware issues >> > providing >> > + incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for >> > IOMMU >> > + accesses for Dom0, with this option all pages up to 4GB, not marked as >> > + unusable in the E820 table, will get a mapping established. Note that >> > this >> > + option is only applicable to a PV Dom0 and is enabled by default on >> > Intel >> > + hardware. >> >> No word at all about the interaction with none/strict/relaxed? I think, >> as mentioned for patch 1, "none" renders this option meaningless as >> well. But for "relaxed" it's pretty unclear, because from E820 alone >> you can't judge whether e.g. a reserved region is RAM or MMIO. (As >> an implication, the mentioning of RAM in patch 1's doc for "relaxed" >> then looks symmetrically wrong, just like I've already asked to replace >> "memory" by "RAM" for "strict".) > > Hm, I'm not sure I follow. 'relaxed' refers to how the regions marked > as RAM in the memory map (E820_RAM) will be mapped to the guest. Hmm, it may well be that I'm (once again) confused with the various options and their precise effects; perhaps I didn't pay (enough) attention to the use of iommu_inclusive_mapping in vtd_set_hwdom_mapping(). >> > @@ -202,6 +210,13 @@ void __hwdom_init iommu_hwdom_init(struct domain *d) >> > if ( !iommu_enabled ) >> > return; >> > >> > +if ( iommu_dom0_inclusive == true && !is_pv_domain(d) ) >> >> Why the "== true"? It shouldn't have its initializer value of -1 anymore >> at this point. > > It can have a value of -1, AFAICT the default values are set by > hd->platform_ops->hwdom_init which is called later in this function. Hmm, I see. But that's not very nice. Can't you move this check between the ->hwdom_init() hook invocation and the call to arch_iommu_hwdom_init() that you add? The latter is the only place where it's actually needed. Jan ___ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel