Dear Robin:
    Thank you for your explanation. Now I understand that this could be the
NIC driver's fault, but how can I confirm it? Do I have to debug the driver
myself?
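
Here is what I am thinking of trying first, as a sketch rather than a confirmed
procedure: the kernel's DMA debugging facility (CONFIG_DMA_API_DEBUG) tracks
every dma_map_*/dma_unmap_* call, so if a driver never unmaps its buffers the
pool of debug entries should shrink steadily, and the reporting can be narrowed
to a single driver. Whether the CoreOS kernel can be rebuilt with that option,
and the exact debugfs file names on 4.19, are assumptions on my part; "ixgbe"
below is simply the PF driver I suspect for the 82599.

```
# Boot a kernel built with CONFIG_DMA_API_DEBUG=y, then:

# dma-debug turns itself off once its entry pool is exhausted, so a 'Y'
# here would already hint at a huge number of outstanding mappings.
cat /sys/kernel/debug/dma-api/disabled

# Watch the free debug entries: a steady, unbounded decrease under normal
# load points at buffers that are mapped but never unmapped.
cat /sys/kernel/debug/dma-api/num_free_entries
cat /sys/kernel/debug/dma-api/min_free_entries

# Narrow the error reporting down to the suspected NIC driver
# (driver name assumed).
echo ixgbe > /sys/kernel/debug/dma-api/driver_filter
cat /sys/kernel/debug/dma-api/error_count
```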

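A cruder cross-check that needs no kernel rebuild, again only a sketch:
correlate the growth of the iommu_iova slab with activity on the suspected
device. If the object count climbs only while the NIC (or the VMs behind its
VFs) is pushing traffic, and stays flat when it is quiescent, that would point
in the same direction.

```
# Sample the iommu_iova slab every 10 seconds; the first numeric field
# after the cache name in /proc/slabinfo is the active object count.
watch -n 10 "grep iommu_iova /proc/slabinfo"

# Or log it over time to compare against the NIC's traffic counters
# (output path is just an example):
while true; do
    printf '%s %s\n' "$(date +%s)" "$(grep iommu_iova /proc/slabinfo)"
    sleep 10
done >> /tmp/iova-growth.log
```
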
Robin Murphy <robin.mur...@arm.com> wrote on Fri, Apr 24, 2020 at 8:15 PM:

> On 2020-04-24 1:06 pm, Bin wrote:
> > I'm not familiar with the MMU stuff. By "some driver leaking DMA
> > mappings", do you mean that some other kernel module, like KVM or the NIC
> > driver, could be causing the leak instead of the iommu module itself?
>
> Yes - I doubt that intel-iommu itself is failing to free IOVAs when it
> should, since I'd expect a lot of people to have noticed that. It's far
> more likely that some driver is failing to call dma_unmap_* when it's
> finished with a buffer - with the IOMMU disabled that would be a no-op
> on x86 with a modern 64-bit-capable device, so such a latent bug could
> have been easily overlooked.
>
> Robin.
>
> > Bin <anole1...@gmail.com> wrote on Fri, Apr 24, 2020 at 8:00 PM:
> >
> >> Well, that's the problem! I'm assuming the iommu kernel module is leaking
> >> memory, but I don't know why or how.
> >>
> >> Do you have any idea about it? Or is any further information needed?
> >>
> >> Robin Murphy <robin.mur...@arm.com> wrote on Fri, Apr 24, 2020 at 7:20 PM:
> >>
> >>> On 2020-04-24 1:40 am, Bin wrote:
> >>>> Hello? Anyone there?
> >>>>
> >>>> Bin <anole1...@gmail.com> wrote on Thu, Apr 23, 2020 at 5:14 PM:
> >>>>
> >>>>> Forgot to mention: I've already disabled slab merging, so this is what
> >>>>> it is.
> >>>>>
> >>>>> Bin <anole1...@gmail.com> wrote on Thu, Apr 23, 2020 at 5:11 PM:
> >>>>>
> >>>>>> Hey, guys:
> >>>>>>
> >>>>>> I'm running a batch of CoreOS boxes; the lsb-release info is:
> >>>>>>
> >>>>>> ```
> >>>>>> # cat /etc/lsb-release
> >>>>>> DISTRIB_ID="Container Linux by CoreOS"
> >>>>>> DISTRIB_RELEASE=2303.3.0
> >>>>>> DISTRIB_CODENAME="Rhyolite"
> >>>>>> DISTRIB_DESCRIPTION="Container Linux by CoreOS 2303.3.0 (Rhyolite)"
> >>>>>> ```
> >>>>>>
> >>>>>> ```
> >>>>>> # uname -a
> >>>>>> Linux cloud-worker-25 4.19.86-coreos #1 SMP Mon Dec 2 20:13:38 -00 2019 x86_64 Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz GenuineIntel GNU/Linux
> >>>>>> ```
> >>>>>> Recently, I found my VMs constantly being killed due to OOM, and after
> >>>>>> digging into the problem, I finally realized that the kernel is leaking
> >>>>>> memory.
> >>>>>>
> >>>>>> Here's my slabinfo:
> >>>>>>
> >>>>>>    Active / Total Objects (% used)    : 83818306 / 84191607 (99.6%)
> >>>>>>    Active / Total Slabs (% used)      : 1336293 / 1336293 (100.0%)
> >>>>>>    Active / Total Caches (% used)     : 152 / 217 (70.0%)
> >>>>>>    Active / Total Size (% used)       : 5828768.08K / 5996848.72K (97.2%)
> >>>>>>    Minimum / Average / Maximum Object : 0.01K / 0.07K / 23.25K
> >>>>>>
> >>>>>>     OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> >>>>>> 80253888 80253888 100%    0.06K 1253967       64   5015868K iommu_iova
> >>>
> >>> Do you really have a peak demand of ~80 million simultaneous DMA
> >>> buffers, or is some driver leaking DMA mappings?
> >>>
> >>> Robin.
> >>>
> >>>>>> 489472 489123  99%    0.03K   3824      128     15296K kmalloc-32
> >>>>>> 297444 271112  91%    0.19K   7082       42     56656K dentry
> >>>>>> 254400 252784  99%    0.06K   3975       64     15900K anon_vma_chain
> >>>>>> 222528  39255  17%    0.50K   6954       32    111264K kmalloc-512
> >>>>>> 202482 201814  99%    0.19K   4821       42     38568K vm_area_struct
> >>>>>> 200192 200192 100%    0.01K    391      512      1564K kmalloc-8
> >>>>>> 170528 169359  99%    0.25K   5329       32     42632K filp
> >>>>>> 158144 153508  97%    0.06K   2471       64      9884K kmalloc-64
> >>>>>> 149914 149365  99%    0.09K   3259       46     13036K anon_vma
> >>>>>> 146640 143123  97%    0.10K   3760       39     15040K buffer_head
> >>>>>> 130368  32791  25%    0.09K   3104       42     12416K kmalloc-96
> >>>>>> 129752 129752 100%    0.07K   2317       56      9268K Acpi-Operand
> >>>>>> 105468 105106  99%    0.04K   1034      102      4136K selinux_inode_security
> >>>>>>    73080  73080 100%    0.13K   2436       30      9744K kernfs_node_cache
> >>>>>>    72360  70261  97%    0.59K   1340       54     42880K inode_cache
> >>>>>>    71040  71040 100%    0.12K   2220       32      8880K eventpoll_epi
> >>>>>>    68096  59262  87%    0.02K    266      256      1064K kmalloc-16
> >>>>>>    53652  53652 100%    0.04K    526      102      2104K pde_opener
> >>>>>>    50496  31654  62%    2.00K   3156       16    100992K kmalloc-2048
> >>>>>>    46242  46242 100%    0.19K   1101       42      8808K cred_jar
> >>>>>>    44496  43013  96%    0.66K    927       48     29664K proc_inode_cache
> >>>>>>    44352  44352 100%    0.06K    693       64      2772K task_delay_info
> >>>>>>    43516  43471  99%    0.69K    946       46     30272K sock_inode_cache
> >>>>>>    37856  27626  72%    1.00K   1183       32     37856K kmalloc-1024
> >>>>>>    36736  36736 100%    0.07K    656       56      2624K eventpoll_pwq
> >>>>>>    34076  31282  91%    0.57K   1217       28     19472K radix_tree_node
> >>>>>>    33660  30528  90%    1.05K   1122       30     35904K ext4_inode_cache
> >>>>>>    32760  30959  94%    0.19K    780       42      6240K kmalloc-192
> >>>>>>    32028  32028 100%    0.04K    314      102      1256K ext4_extent_status
> >>>>>>    30048  30048 100%    0.25K    939       32      7512K skbuff_head_cache
> >>>>>>    28736  28736 100%    0.06K    449       64      1796K fs_cache
> >>>>>>    24702  24702 100%    0.69K    537       46     17184K files_cache
> >>>>>>    23808  23808 100%    0.66K    496       48     15872K ovl_inode
> >>>>>>    23104  22945  99%    0.12K    722       32      2888K kmalloc-128
> >>>>>>    22724  21307  93%    0.69K    494       46     15808K shmem_inode_cache
> >>>>>>    21472  21472 100%    0.12K    671       32      2684K seq_file
> >>>>>>    19904  19904 100%    1.00K    622       32     19904K UNIX
> >>>>>>    17340  17340 100%    1.06K    578       30     18496K mm_struct
> >>>>>>    15980  15980 100%    0.02K     94      170       376K avtab_node
> >>>>>>    14070  14070 100%    1.06K    469       30     15008K signal_cache
> >>>>>>    13248  13248 100%    0.12K    414       32      1656K pid
> >>>>>>    12128  11777  97%    0.25K    379       32      3032K kmalloc-256
> >>>>>>    11008  11008 100%    0.02K     43      256       172K selinux_file_security
> >>>>>>    10812  10812 100%    0.04K    106      102       424K Acpi-Namespace
> >>>>>>
> >>>>>> This information shows that 'iommu_iova' is the top memory consumer.
> >>>>>> In order to optimize the network performance of OpenStack virtual
> >>>>>> machines, I enabled the VT-d feature in the BIOS and the SR-IOV feature
> >>>>>> of the Intel 82599 10G NIC. I'm assuming this is the root cause of this
> >>>>>> issue.
> >>>>>>
> >>>>>> Is there anything I can do to fix it?
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
