Re: [PATCH] iommu/amd: Fix legacy interrupt remapping for x2APIC-enabled system

2020-04-25 Thread Suravee Suthikulpanit

Ping.

Thanks,
Suravee

On 4/22/20 8:30 PM, Suravee Suthikulpanit wrote:

Currently, the system fails to boot because the legacy interrupt remapping
mode does not enable 128-bit IRTE (GA), which is required for x2APIC
support.

Fix by using AMD_IOMMU_GUEST_IR_LEGACY_GA mode when booting with the
kernel option amd_iommu_intr=legacy instead. The initialization
logic will check GASup and automatically fall back to
AMD_IOMMU_GUEST_IR_LEGACY if GA mode is not supported.

Fixes: 3928aa3f5775 ("iommu/amd: Detect and enable guest vAPIC support")
Signed-off-by: Suravee Suthikulpanit 
---
  drivers/iommu/amd_iommu_init.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 6be3853..2b9a67e 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2936,7 +2936,7 @@ static int __init parse_amd_iommu_intr(char *str)
 {
 	for (; *str; ++str) {
 		if (strncmp(str, "legacy", 6) == 0) {
-			amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
+			amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA;
 			break;
 		}
 		if (strncmp(str, "vapic", 5) == 0) {

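The fallback the changelog relies on can be pictured with a minimal sketch
(hypothetical names, not the actual amd_iommu_init.c logic): the boot option
now requests the GA-capable legacy mode, and initialization downgrades to
plain 32-bit IRTEs only when the hardware does not advertise GASup.

```c
#include <linux/types.h>

/* Hypothetical sketch of the GASup fallback described above; the enum
 * and helper names are illustrative, not the kernel's actual code. */
enum guest_ir_mode {
	GUEST_IR_LEGACY,	/* 32-bit IRTEs, cannot address x2APIC IDs */
	GUEST_IR_LEGACY_GA,	/* 128-bit IRTEs, x2APIC-capable */
};

static enum guest_ir_mode pick_guest_ir(enum guest_ir_mode requested,
					bool hw_has_gasup)
{
	/* 128-bit (GA) IRTEs require hardware GASup; without it the
	 * only safe choice is the plain legacy format. */
	if (requested == GUEST_IR_LEGACY_GA && !hw_has_gasup)
		return GUEST_IR_LEGACY;
	return requested;
}
```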



Re: iommu_iova slab eats too much memory

2020-04-25 Thread Bin
Dear John:
Thank you for your reply. The case you mentioned is a typical
performance-regression issue; even in the worst case there is no need for
the kernel to OOM-kill a random process. But in my observations, the
iommu_iova slab can consume up to 40G of memory, and the kernel has to kill
my VM process to free memory (64G of memory installed). So I don't think
it's relevant.


John Garry wrote on Sat, Apr 25, 2020 at 1:50 AM:

> On 24/04/2020 17:30, Robin Murphy wrote:
> > On 2020-04-24 2:20 pm, Bin wrote:
> >> Dear Robin:
> >>   Thank you for your explanation. Now, I understand that this could be
> >> the NIC driver's fault, but how can I confirm it? Do I have to debug the
> >> driver myself?
> >
> > I'd start with CONFIG_DMA_API_DEBUG - of course it will chew through
> > memory about an order of magnitude faster than the IOVAs alone, but it
> > should shed some light on whether DMA API usage looks suspicious, and
> > dumping the mappings should help track down the responsible driver(s).
> > Although the debugfs code doesn't show the stacktrace of where each
> > mapping was made, I guess it would be fairly simple to tweak that for a
> > quick way to narrow down where to start looking in an offending driver.
> >
> > Robin.
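As a concrete starting point for the CONFIG_DMA_API_DEBUG route Robin
suggests, a rough session sketch (this assumes a kernel built with
CONFIG_DMA_API_DEBUG=y and a mounted debugfs; the dump file only exists on
newer kernels):

```
# Mount debugfs if it is not already mounted
mount -t debugfs none /sys/kernel/debug

# Free vs. total tracking entries: a free count that only ever shrinks
# suggests mappings are being created but never unmapped
cat /sys/kernel/debug/dma-api/num_free_entries
cat /sys/kernel/debug/dma-api/nr_total_entries

# Where available, dump the live mappings and count them per device and
# driver to see which one holds the most
awk '{print $1, $2}' /sys/kernel/debug/dma-api/dump | sort | uniq -c | sort -rn | head
```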
>
> Just mentioning this in case it's relevant - we found that a long-term
> aging throughput test causes the RB tree to grow very large (and, I
> assume, eat lots of memory):
>
>
> https://lore.kernel.org/linux-iommu/20190815121104.29140-3-thunder.leiz...@huawei.com/
>
> John
>
> >
> >> Robin Murphy wrote on Fri, Apr 24, 2020 at 8:15 PM:
> >>
> >>> On 2020-04-24 1:06 pm, Bin wrote:
>  I'm not familiar with the MMU stuff, so regarding what you mean by
>  "some driver leaking DMA mappings": is it possible that some other
>  kernel module, like KVM or a NIC driver, causes the leak instead of
>  the iommu module itself?
> >>>
> >>> Yes - I doubt that intel-iommu itself is failing to free IOVAs when it
> >>> should, since I'd expect a lot of people to have noticed that. It's far
> >>> more likely that some driver is failing to call dma_unmap_* when it's
> >>> finished with a buffer - with the IOMMU disabled that would be a no-op
> >>> on x86 with a modern 64-bit-capable device, so such a latent bug could
> >>> have been easily overlooked.
> >>>
> >>> Robin.
> >>>
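To make that bug class concrete, here is a hypothetical driver TX path
(illustrative only, not taken from any real driver):

```c
#include <linux/dma-mapping.h>

/* Hypothetical TX path showing the leak pattern Robin describes: with
 * the IOMMU disabled, dma_map_single() on a 64-bit-capable x86 device
 * is essentially free, so a missing unmap goes unnoticed; with the
 * IOMMU enabled, every call allocates an IOVA that is never returned,
 * and the iommu_iova cache grows without bound. */
static int tx_one(struct device *dev, void *buf, size_t len)
{
	dma_addr_t dma = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	if (dma_mapping_error(dev, dma))
		return -ENOMEM;

	/* ... hand "dma" to the hardware and wait for completion ... */

	/* The bug: omitting this call leaks one mapping per packet. */
	dma_unmap_single(dev, dma, len, DMA_TO_DEVICE);
	return 0;
}
```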
>  Bin wrote on Fri, Apr 24, 2020 at 20:00:
> 
> > Well, that's the problem! I'm assuming the iommu kernel module is
> > leaking memory. But I don't know why or how.
> >
> > Do you have any idea about it? Or is any further information needed?
> >
> > Robin Murphy wrote on Fri, Apr 24, 2020 at 19:20:
> >
> >> On 2020-04-24 1:40 am, Bin wrote:
> >>> Hello? anyone there?
> >>>
> >>> Bin wrote on Thu, Apr 23, 2020 at 5:14 PM:
> >>>
>  Forgot to mention: I've already disabled slab merging, so this is
>  what it is.
> 
>  Bin wrote on Thu, Apr 23, 2020 at 5:11 PM:
> 
> > Hey, guys:
> >
> > I'm running a batch of CoreOS boxes; their lsb_release is:
> >
> > ```
> > # cat /etc/lsb-release
> > DISTRIB_ID="Container Linux by CoreOS"
> > DISTRIB_RELEASE=2303.3.0
> > DISTRIB_CODENAME="Rhyolite"
> > DISTRIB_DESCRIPTION="Container Linux by CoreOS 2303.3.0 (Rhyolite)"
> > ```
> >
> > ```
> > # uname -a
> > Linux cloud-worker-25 4.19.86-coreos #1 SMP Mon Dec 2 20:13:38 -00 2019 x86_64 Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz GenuineIntel GNU/Linux
> > ```
> > Recently, I found my VMs constantly being killed due to OOM, and after
> > digging into the problem, I finally realized that the kernel is leaking
> > memory.
> >
> > Here's my slabinfo:
> >
> >  Active / Total Objects (% used)    : 83818306 / 84191607 (99.6%)
> >  Active / Total Slabs (% used)      : 1336293 / 1336293 (100.0%)
> >  Active / Total Caches (% used)     : 152 / 217 (70.0%)
> >  Active / Total Size (% used)       : 5828768.08K / 5996848.72K (97.2%)
> >  Minimum / Average / Maximum Object : 0.01K / 0.07K / 23.25K
> >
> >     OBJS   ACTIVE  USE OBJ SIZE   SLABS OBJ/SLAB CACHE SIZE NAME
> > 80253888 80253888 100%    0.06K 1253967       64   5015868K iommu_iova
> >>
> >> Do you really have a peak demand of ~80 million simultaneous DMA
> >> buffers, or is some driver leaking DMA mappings?
> >>
> >> Robin.
> >>
> >   489472   489123  99%    0.03K    3824      128     15296K kmalloc-32
> >   297444   271112  91%    0.19K    7082       42     56656K dentry
> >   254400   252784  99%    0.06K    3975       64     15900K anon_vma_chain
> >   222528    39255  17%    0.50K    6954       32    111264K kmalloc-512
>