Forgot CC-ing Jerry, add him.

On 09/23/20 at 10:26am, Baoquan He wrote:
> A regression failure of kdump kernel boot was reported on a HPE system.
> Bisect points at commit 387caf0b759ac43 ("iommu/amd: Treat per-device
> exclusion ranges as r/w unity-mapped regions") as criminal. Reverting it
> fix the failure.
> 
> With the commit, kdump kernel will always print below error message, then
> naturally AMD iommu can't function normally during kdump kernel bootup.
> 
>   ~~~~~~~~~
>   AMD-Vi: [Firmware Bug]: IVRS invalid checksum
> 
> Why commit 387caf0b759ac43 causing it haven't been made clear.

Hi Joerg, Adrian

We only have one machine which can reproduce the issue, it's a gen10-01
of HPE. If any log or info are needed, please let me know, I can attach
here.

Thanks
Baoquan

> 
> From the commit log, a discussion thread link is pasted. In that discussion
> thread, Adrian told the fix is for a system with already broken BIOS, and
> Joerg suggested two options. Finally option 2) is taken. Maybe option 1)
> should be the right approach?
> 
>   1) Bail out and disable the IOMMU as the BIOS screwed up
>   2) Treat per-device exclusion ranges just as r/w unity-mapped
>      regions.
> 
> https://lists.linuxfoundation.org/pipermail/iommu/2019-November/040117.html
> Signed-off-by: Baoquan He <b...@redhat.com>
> ---
>  drivers/iommu/amd/init.c | 21 +++++++++++++--------
>  1 file changed, 13 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 9aa1eae26634..bbe7ceae5949 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -1109,17 +1109,22 @@ static int __init add_early_maps(void)
>   */
>  static void __init set_device_exclusion_range(u16 devid, struct ivmd_header 
> *m)
>  {
> +     struct amd_iommu *iommu = amd_iommu_rlookup_table[devid];
> +
>       if (!(m->flags & IVMD_FLAG_EXCL_RANGE))
>               return;
>  
> -     /*
> -      * Treat per-device exclusion ranges as r/w unity-mapped regions
> -      * since some buggy BIOSes might lead to the overwritten exclusion
> -      * range (exclusion_start and exclusion_length members). This
> -      * happens when there are multiple exclusion ranges (IVMD entries)
> -      * defined in ACPI table.
> -      */
> -     m->flags = (IVMD_FLAG_IW | IVMD_FLAG_IR | IVMD_FLAG_UNITY_MAP);
> +     if (iommu) {
> +             /*
> +              * We only can configure exclusion ranges per IOMMU, not
> +              * per device. But we can enable the exclusion range per
> +              * device. This is done here
> +              */
> +             set_dev_entry_bit(devid, DEV_ENTRY_EX);
> +             iommu->exclusion_start = m->range_start;
> +             iommu->exclusion_length = m->range_length;
> +     }
> +
>  }
>  
>  /*
> -- 
> 2.17.2
> 
> _______________________________________________
> iommu mailing list
> io...@lists.linux-foundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/iommu
> 

Reply via email to