Re: [PATCH 1/1] x86/vtd: Mask DMAR faults for IGD devices
On 17.04.2020 15:36, Brendan Kerrigan wrote: > The Intel graphics device records DMAR faults regardless > of the Fault Process Disable bit being set. I can't seem to be able to find a place where we would set FPD. Why do we need the workaround then, or what am I missing? > When this fault > occurs, enable the Interrupt Mask (IM) bit in the Fault > Event Control (FECTL) register to prevent the continued > recording of the fault. This should mention the underlying erratum in one way or another. > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -41,6 +41,8 @@ > #include "vtd.h" > #include "../ats.h" > > +#define IS_IGD(seg, id) (0 == seg && 0 == PCI_BUS(id) && 2 == PCI_SLOT(id) > && 0 == PCI_FUNC(id)) > + > struct mapped_rmrr { > struct list_head list; > u64 base, end; > @@ -872,6 +874,13 @@ static int iommu_page_fault_do_one(struct vtd_iommu > *iommu, int type, > printk(XENLOG_G_WARNING VTDPREFIX "%s: reason %02x - %s\n", > kind, fault_reason, reason); > > +if ( DMA_REMAP == fault_type && type && IS_IGD(seg, source_id) ) { Using existing infrastructure would be at least some improvement, in particular to avoid this workaround triggering on unaffected hardware (including such where there's something else at :00:02.0). See e.g. is_igd_drhd(), and note that most uses of it are currently further qualified to limit what hardware quirk workarounds get applied to. In any event you'll want to clarify in the description that this won't ever be done on hardware not having this issue, at least as long as this isn't obvious from the code. Also - why the "type" part of the conditional? The erratum text doesn't talk about only DMA reads being affected. Finally, style-wise the brace belongs on its own line. > +u32 fectl = dmar_readl(iommu->reg, DMAR_FECTL_REG); > +fectl |= DMA_FECTL_IM; While this is the recommended workaround, I don't see you undoing the masking anywhere later, and I'm also unconvinced this should be done upon seeing the first fault. One option might be to do so on the the first fault when FPD is set for the offending device. Or else, following the title, it might want/need doing unconditionally at initialization time. > +dmar_writel(iommu->reg, DMAR_FECTL_REG, fectl); > +printk(XENLOG_G_WARNING VTDPREFIX "Disabling DMAR faults for IGD\n"); If, as suggested above, IM will get cleared again at some point, this printk() should imo not be issued more than once. Jan
Re: [PATCH 1/1] x86/vtd: Mask DMAR faults for IGD devices
While it's described as errata for gen8/9, the previous reporting was from 2015 which predates those generations. I tested it on a gen 7 box (which was causing me the grief of consuming my Xen console buffer). It could be the case that the FPD code isn't implemented (which wouldn't matter for gen8/9 because it is broken), and the original problem of faulty firmware reporting bad ranges is the ultimate culprit. As far as the last two questions, I was testing on an older version of Xen (4.9.x) and ported it to master. Happy to hear a better approach to solving the original problem. -Brendan On Sun, Apr 19, 2020 at 11:28 PM Tian, Kevin wrote: > > From: Brendan Kerrigan > > Sent: Friday, April 17, 2020 9:36 PM > > > > From: Brendan Kerrigan > > > > The Intel graphics device records DMAR faults regardless > > of the Fault Process Disable bit being set. When this fault > > Since this is an errata for specific generations, let's describe > this way instead of stating it as a generic problem. > > > occurs, enable the Interrupt Mask (IM) bit in the Fault > > Event Control (FECTL) register to prevent the continued > > recording of the fault. > > > > Signed-off-by: Brendan Kerrigan > > --- > > xen/drivers/passthrough/vtd/iommu.c | 9 + > > 1 file changed, 9 insertions(+) > > > > diff --git a/xen/drivers/passthrough/vtd/iommu.c > > b/xen/drivers/passthrough/vtd/iommu.c > > index 07d40b37fe..288399d816 100644 > > --- a/xen/drivers/passthrough/vtd/iommu.c > > +++ b/xen/drivers/passthrough/vtd/iommu.c > > @@ -41,6 +41,8 @@ > > #include "vtd.h" > > #include "../ats.h" > > > > +#define IS_IGD(seg, id) (0 == seg && 0 == PCI_BUS(id) && 2 == > PCI_SLOT(id) > > && 0 == PCI_FUNC(id)) > > + > > struct mapped_rmrr { > > struct list_head list; > > u64 base, end; > > @@ -872,6 +874,13 @@ static int iommu_page_fault_do_one(struct > > vtd_iommu *iommu, int type, > > printk(XENLOG_G_WARNING VTDPREFIX "%s: reason %02x - %s\n", > > kind, fault_reason, reason); > > > > +if ( DMA_REMAP == fault_type && type && IS_IGD(seg, source_id) ) { > > +u32 fectl = dmar_readl(iommu->reg, DMAR_FECTL_REG); > > +fectl |= DMA_FECTL_IM; > > +dmar_writel(iommu->reg, DMAR_FECTL_REG, fectl); > > +printk(XENLOG_G_WARNING VTDPREFIX "Disabling DMAR faults for > > IGD\n"); > > +} > > + > > Several questions. First, I just note that FPD is not touched by any code > today. then how is this errata being caught? Second, we already have > pci_check_disable_device in place which stops DMAs from the problematic > device if it triggers excessive fault reports. why doesn't it work for your > case? Last, dma_msi_end just forces clearing the mask bit at end of > handling > the fault interrupt, making above fix meaningless. Those questions just > make me wonder how you actually test this patch and whether it's > necessary... > > Thanks > Kevin > > > if ( iommu_verbose && fault_type == DMA_REMAP ) > > print_vtd_entries(iommu, PCI_BUS(source_id), > PCI_DEVFN2(source_id), > >addr >> PAGE_SHIFT); > > -- > > 2.17.1 > >
RE: [PATCH 1/1] x86/vtd: Mask DMAR faults for IGD devices
> From: Brendan Kerrigan > Sent: Friday, April 17, 2020 9:36 PM > > From: Brendan Kerrigan > > The Intel graphics device records DMAR faults regardless > of the Fault Process Disable bit being set. When this fault Since this is an errata for specific generations, let's describe this way instead of stating it as a generic problem. > occurs, enable the Interrupt Mask (IM) bit in the Fault > Event Control (FECTL) register to prevent the continued > recording of the fault. > > Signed-off-by: Brendan Kerrigan > --- > xen/drivers/passthrough/vtd/iommu.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index 07d40b37fe..288399d816 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -41,6 +41,8 @@ > #include "vtd.h" > #include "../ats.h" > > +#define IS_IGD(seg, id) (0 == seg && 0 == PCI_BUS(id) && 2 == PCI_SLOT(id) > && 0 == PCI_FUNC(id)) > + > struct mapped_rmrr { > struct list_head list; > u64 base, end; > @@ -872,6 +874,13 @@ static int iommu_page_fault_do_one(struct > vtd_iommu *iommu, int type, > printk(XENLOG_G_WARNING VTDPREFIX "%s: reason %02x - %s\n", > kind, fault_reason, reason); > > +if ( DMA_REMAP == fault_type && type && IS_IGD(seg, source_id) ) { > +u32 fectl = dmar_readl(iommu->reg, DMAR_FECTL_REG); > +fectl |= DMA_FECTL_IM; > +dmar_writel(iommu->reg, DMAR_FECTL_REG, fectl); > +printk(XENLOG_G_WARNING VTDPREFIX "Disabling DMAR faults for > IGD\n"); > +} > + Several questions. First, I just note that FPD is not touched by any code today. then how is this errata being caught? Second, we already have pci_check_disable_device in place which stops DMAs from the problematic device if it triggers excessive fault reports. why doesn't it work for your case? Last, dma_msi_end just forces clearing the mask bit at end of handling the fault interrupt, making above fix meaningless. Those questions just make me wonder how you actually test this patch and whether it's necessary... Thanks Kevin > if ( iommu_verbose && fault_type == DMA_REMAP ) > print_vtd_entries(iommu, PCI_BUS(source_id), PCI_DEVFN2(source_id), >addr >> PAGE_SHIFT); > -- > 2.17.1
[PATCH 1/1] x86/vtd: Mask DMAR faults for IGD devices
From: Brendan Kerrigan The Intel graphics device records DMAR faults regardless of the Fault Process Disable bit being set. When this fault occurs, enable the Interrupt Mask (IM) bit in the Fault Event Control (FECTL) register to prevent the continued recording of the fault. Signed-off-by: Brendan Kerrigan --- xen/drivers/passthrough/vtd/iommu.c | 9 + 1 file changed, 9 insertions(+) diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c index 07d40b37fe..288399d816 100644 --- a/xen/drivers/passthrough/vtd/iommu.c +++ b/xen/drivers/passthrough/vtd/iommu.c @@ -41,6 +41,8 @@ #include "vtd.h" #include "../ats.h" +#define IS_IGD(seg, id) (0 == seg && 0 == PCI_BUS(id) && 2 == PCI_SLOT(id) && 0 == PCI_FUNC(id)) + struct mapped_rmrr { struct list_head list; u64 base, end; @@ -872,6 +874,13 @@ static int iommu_page_fault_do_one(struct vtd_iommu *iommu, int type, printk(XENLOG_G_WARNING VTDPREFIX "%s: reason %02x - %s\n", kind, fault_reason, reason); +if ( DMA_REMAP == fault_type && type && IS_IGD(seg, source_id) ) { +u32 fectl = dmar_readl(iommu->reg, DMAR_FECTL_REG); +fectl |= DMA_FECTL_IM; +dmar_writel(iommu->reg, DMAR_FECTL_REG, fectl); +printk(XENLOG_G_WARNING VTDPREFIX "Disabling DMAR faults for IGD\n"); +} + if ( iommu_verbose && fault_type == DMA_REMAP ) print_vtd_entries(iommu, PCI_BUS(source_id), PCI_DEVFN2(source_id), addr >> PAGE_SHIFT); -- 2.17.1