On 12/16/19 2:07 PM, Chen, Yian wrote:


On 12/11/2019 11:46 AM, Barret Rhoden wrote:
RMRR entries describe memory regions that are DMA targets for devices
outside the kernel's control.

RMRR entries that fail the sanity check are pointing to regions of
memory that the firmware did not tell the kernel are reserved or
otherwise should not be used.

Instead of aborting DMAR processing, this commit skips these RMRR
entries.  They will not be mapped into the IOMMU, but the IOMMU can
still be utilized.  If anything, when the IOMMU is on, those devices
will not be able to clobber RAM that the kernel has allocated from those
regions.

Signed-off-by: Barret Rhoden <b...@google.com>
---
  drivers/iommu/intel-iommu.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index f168cd8ee570..f7e09244c9e4 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4316,7 +4316,7 @@ int __init dmar_parse_one_rmrr(struct acpi_dmar_header *header, void *arg)
      rmrr = (struct acpi_dmar_reserved_memory *)header;
      ret = arch_rmrr_sanity_check(rmrr);
      if (ret)
-        return ret;
+        return 0;
      rmrru = kzalloc(sizeof(*rmrru), GFP_KERNEL);
      if (!rmrru)

The RMRR parsing function should report the error to its caller. How to respond to the error can then be chosen by a caller further up the call stack, for example dmar_walk_remapping_entries(). My concern is that ignoring a detected firmware bug might have side effects, even though it seems safe in your case.

That's a little difficult given the current code.  Once we are in
dmar_walk_remapping_entries(), the specific function (parse_one_rmrr) is called via callback:

        ret = cb->cb[iter->type](iter, cb->arg[iter->type]);
        if (ret)
                return ret;

If there's an error of any sort, it aborts the walk. Handling the specific errors here is difficult, since we don't know what the errors mean to the specific callback. Is there some errno we can use that means "there was a problem, but it's not so bad that you have to abort, but I figured you ought to know"? Not that I think that's a good idea.
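
As an illustration of the abort behavior described above, here is a minimal userspace model. It is a sketch, not the kernel code: `struct entry`, `strict_cb`, `walk_entries`, and `parse_cb` are hypothetical names, and only the `if (ret) return ret;` shape is taken from the snippet quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one DMAR sub-table entry. */
struct entry {
	int type;	/* index into the callback table */
	int bad;	/* nonzero if the entry fails its sanity check */
};

typedef int (*parse_cb)(const struct entry *e, void *arg);

/* Counts the entries it accepts; returns an error for a bad entry. */
static int strict_cb(const struct entry *e, void *arg)
{
	int *accepted = arg;

	if (e->bad)
		return -EINVAL;
	(*accepted)++;
	return 0;
}

/*
 * Same shape as the walk quoted above: the first nonzero return
 * from any callback ends the walk and is handed to the caller,
 * which cannot tell what the value meant to that callback.
 */
static int walk_entries(const struct entry *tbl, size_t n,
			const parse_cb *cbs, void *arg)
{
	size_t i;
	int ret;

	assert(tbl != NULL && cbs != NULL);
	for (i = 0; i < n; i++) {
		ret = cbs[tbl[i].type](&tbl[i], arg);
		if (ret)
			return ret;
	}
	return 0;
}
```

With a table of three entries where the second one is bad, this model stops at the second entry, so the third is never parsed even though it is valid.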

The knowledge of whether or not a specific error is worth aborting all DMAR functionality is best known inside the specific callback. The only handling to do is print a warning and either skip it or abort.
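
The "print a warning and skip" option can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct rmrr`, `sanity_check`, and `parse_one_rmrr_skip` are made-up names, the `reserved_by_firmware` flag stands in for whatever arch_rmrr_sanity_check() really tests, and `fprintf` stands in for a kernel warning.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical, simplified RMRR entry: just an address range. */
struct rmrr {
	unsigned long long base;
	unsigned long long end;
	int reserved_by_firmware;	/* stand-in for the real check */
};

/* Stand-in for arch_rmrr_sanity_check(): 0 if OK, -EINVAL if not. */
static int sanity_check(const struct rmrr *r)
{
	return r->reserved_by_firmware ? 0 : -EINVAL;
}

/*
 * The callback itself decides that a bad entry is not fatal:
 * it warns, skips the entry, and still returns 0, so a walker
 * that aborts on any nonzero return keeps going. *mapped counts
 * only the entries that passed the check.
 */
static int parse_one_rmrr_skip(const struct rmrr *r, int *mapped)
{
	assert(r != NULL && mapped != NULL);
	if (sanity_check(r)) {
		fprintf(stderr, "skipping bad RMRR [0x%llx-0x%llx]\n",
			r->base, r->end);
		return 0;
	}
	(*mapped)++;
	return 0;
}
```

The abort option would differ only in returning the error from the failed check instead of 0.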

I think skipping the entry for a bad RMRR is better than aborting completely, though I understand if people don't like that. It's debatable. By aborting, we lose the ability to use the IOMMU at all, but we are still in a situation where the devices using the RMRR regions might be clobbering kernel memory, right? Using the IOMMU (with no mappings for the bad RMRRs) would stop those devices from clobbering memory.

Regardless, I have two other patches in this series that could resolve the problem for me and probably other people. I'd just like at least one of the three patches to get merged so that my machine boots when the original commit f036c7fa0ab6 ("iommu/vt-d: Check VT-d RMRR region in BIOS is reported as reserved") gets released.

Thanks,

Barret
