This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:
apport-collect 2023143

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

** Changed in: linux (Ubuntu)
       Status: New => Incomplete

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2023143

Title:
  Memory leak on large server

Status in linux package in Ubuntu:
  Incomplete

Bug description:

Hi,

We are trying to diagnose a kernel memory leak on a production Ubuntu 22.04.2 LTS server. We have tried several official Ubuntu kernels: 5.15aws, 5.19aws, and now even 6.2.0-1004-aws (all Ubuntu signed):

```
# cat /proc/version_signature
Ubuntu 6.2.0-1004.4-aws 6.2.6
```

This is a production server, so we'd appreciate any and all help diagnosing and solving this issue!

The server is a u-12tb1.112xlarge instance with 12TB RAM, and it is losing 1TB+ of memory a day to a kernel leak.

For example, with a current uptime of 3.5 days, we have 1.8Ti available, yet RSS plus slabs is only 4.1TB. All active processes together take about 4TB of RAM (`ps -eo rss | awk 'BEGIN {x=0} {x = x + $1} END {print x}'` gives 4088636708).
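
The gap between total RAM and the memory attributed to free pages, processes, and kernel caches can also be estimated directly from `/proc/meminfo`. The following is a minimal sketch, not part of the original report: the function name `unaccounted_kb` is hypothetical, and the choice of fields to subtract is an assumption that only approximates the kernel's accounting (for example, it ignores `VmallocUsed` and huge page reservations).

```shell
#!/bin/sh
# unaccounted_kb [FILE]
# Hypothetical helper: prints MemTotal minus the memory that meminfo
# attributes to free pages, anonymous memory, page cache, and common
# kernel structures. All values are in kB, as reported by the kernel.
# FILE defaults to /proc/meminfo.
unaccounted_kb() {
    awk '
        # Each line looks like "Key:   value kB"; drop the trailing colon.
        { key = $1; sub(/:$/, "", key); v[key] = $2 }
        END {
            accounted  = v["MemFree"]
            accounted += v["AnonPages"]    # process heap/stack pages
            accounted += v["Cached"]       # page cache, includes Shmem
            accounted += v["Buffers"]
            accounted += v["SwapCached"]
            accounted += v["Slab"]         # SReclaimable + SUnreclaim
            accounted += v["KernelStack"]
            accounted += v["PageTables"]
            accounted += v["Percpu"]
            # %.0f avoids integer overflow in awk builds with a 32-bit %d.
            printf "%.0f\n", v["MemTotal"] - accounted
        }
    ' "${1:-/proc/meminfo}"
}
```

Calling `unaccounted_kb` with no argument reads the live `/proc/meminfo`; passing a file name lets the same calculation run on a saved copy, such as one attached to a bug report. A large and steadily growing result would be consistent with memory that the kernel holds outside these counters.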

From slabtop we see that about 100GB is consumed by slab (`slabtop -o -s t | head`):

```
 Active / Total Objects (% used)    : 303580174 / 332642344 (91.3%)
 Active / Total Slabs (% used)      : 6697552 / 6697552 (100.0%)
 Active / Total Caches (% used)     : 158 / 215 (73.5%)
 Active / Total Size (% used)       : 112801663.93K / 121442845.45K (92.9%)
 Minimum / Average / Maximum Object : 0.01K / 0.36K / 16.00K

    OBJS   ACTIVE  USE OBJ SIZE   SLABS OBJ/SLAB CACHE SIZE NAME
67537280 59696907  88%    0.03K  527635      128   2110540K kmalloc-32
65247564 65241398  99%    0.31K 1279364       51  20469824K arc_buf_hdr_t_full
58270446 58040685  99%    0.10K  747057       78   5976456K abd_t
16697268 13731405  82%    0.38K  397554       42   6360864K dmu_buf_impl_t
15982912 10366686  64%    0.50K  249733       64   7991456K kmalloc-512
14975616 11605380  77%    0.06K  233994       64    935976K kmalloc-64
```

In /proc/meminfo:

```
MemTotal: 12656421408 kB
MemFree: 1975976204 kB
MemAvailable: 1968415088 kB
Buffers: 1087956 kB
Cached: 101168004 kB
SwapCached: 17912340 kB
Active: 101022084 kB
Inactive: 4129984264 kB
Active(anon): 94623216 kB
Inactive(anon): 4104673512 kB
Active(file): 6398868 kB
Inactive(file): 25310752 kB
Unevictable: 338908 kB
Mlocked: 332132 kB
SwapTotal: 4294967292 kB
SwapFree: 3500705532 kB
Zswap: 0 kB
Zswapped: 0 kB
Dirty: 2908 kB
Writeback: 0 kB
AnonPages: 4123489132 kB
Mapped: 3761620 kB
Shmem: 70756156 kB
KReclaimable: 10319220 kB
Slab: 122355620 kB
SReclaimable: 10319220 kB
SUnreclaim: 112036400 kB
KernelStack: 1793296 kB
PageTables: 21748556 kB
SecPageTables: 0 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 10623177996 kB
Committed_AS: 6775476544 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 296984480 kB
VmallocChunk: 0 kB
Percpu: 1326080 kB
HardwareCorrupted: 0 kB
AnonHugePages: 1630980096 kB
ShmemHugePages: 0 kB
ShmemPmdMapped: 0 kB
FileHugePages: 0 kB
FilePmdMapped: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Hugetlb: 0 kB
DirectMap4k: 2056036 kB
DirectMap2M: 40935424 kB
DirectMap1G: 12814647296 kB
```

It's not a tmpfs/shm filesystem issue either:

```
df -h | grep -E 'tmpfs|shm'
tmpfs  256G   70G  187G  27% /dev/shm
tmpfs  256G  3.4M  256G   1% /run
tmpfs  5.0M     0  5.0M   0% /run/lock
tmpfs  8.0G   24K  8.0G   1% /run/user/10102
tmpfs  8.0G   24K  8.0G   1% /run/user/1002
tmpfs  8.0G   24K  8.0G   1% /run/user/10030
tmpfs  8.0G   24K  8.0G   1% /run/user/10194
tmpfs  8.0G   24K  8.0G   1% /run/user/10200
tmpfs  8.0G   24K  8.0G   1% /run/user/10136
tmpfs  8.0G   24K  8.0G   1% /run/user/10198
tmpfs  8.0G   24K  8.0G   1% /run/user/10143
tmpfs  8.0G   24K  8.0G   1% /run/user/10188
tmpfs  8.0G   24K  8.0G   1% /run/user/10124
tmpfs  8.0G   24K  8.0G   1% /run/user/10174
tmpfs  8.0G   24K  8.0G   1% /run/user/10165
tmpfs  8.0G   24K  8.0G   1% /run/user/10197
tmpfs  8.0G   24K  8.0G   1% /run/user/10183
tmpfs  8.0G   24K  8.0G   1% /run/user/10033
tmpfs  8.0G   24K  8.0G   1% /run/user/10023
tmpfs  8.0G   24K  8.0G   1% /run/user/10133
tmpfs  8.0G   24K  8.0G   1% /run/user/10185
tmpfs  8.0G   24K  8.0G   1% /run/user/10201
tmpfs  8.0G   24K  8.0G   1% /run/user/1004
tmpfs  8.0G   24K  8.0G   1% /run/user/10014
```

---
ProblemType: Bug
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access '/dev/snd/': No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory: 'aplay'
ApportVersion: 2.20.11-0ubuntu82.5
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord'
CRDA: N/A
CasperMD5CheckResult: unknown
DistroRelease: Ubuntu 22.04
Ec2AMI: ami-08c40ec9ead489470
Ec2AMIManifest: (unknown)
Ec2AvailabilityZone: us-east-1d
Ec2InstanceType: u-12tb1.112xlarge
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig'
Lspci: Error: [Errno 2] No such file or directory: 'lspci'
Lspci-vt: Error: [Errno 2] No such file or directory: 'lspci'
Lsusb: Error: command ['lsusb'] failed with exit code 1:
Lsusb-t:
Lsusb-v: Error: command ['lsusb', '-v'] failed with exit code 1:
MachineType: Amazon EC2 u-12tb1.112xlarge
NonfreeKernelModules: zfs zunicode zavl
 icp zcommon znvpair
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
 LC_CTYPE=C.UTF-8
 TERM=xterm-256color
 PATH=(custom, no user)
 LANG=C.UTF-8
 SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.2.0-1004-aws root=PARTUUID=cbb5015f-ca94-467b-91ae-cce97828a042 ro quiet mitigations=off console=tty1 console=ttyS0 nvme_core.io_timeout=4294967295 panic=-1
ProcVersionSignature: Ubuntu 6.2.0-1004.4-aws 6.2.6
RelatedPackageVersions:
 linux-restricted-modules-6.2.0-1004-aws N/A
 linux-backports-modules-6.2.0-1004-aws N/A
 linux-firmware N/A
RfKill: Error: [Errno 2] No such file or directory: 'rfkill'
Tags: jammy ec2-images
Uname: Linux 6.2.0-1004-aws x86_64
UnreportableReason: This report is about a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: N/A
_MarkForUpload: False
dmi.bios.date: 10/16/2017
dmi.bios.release: 1.0
dmi.bios.vendor: Amazon EC2
dmi.bios.version: 1.0
dmi.board.asset.tag: i-0b8914fe51e3d7555
dmi.board.vendor: Amazon EC2
dmi.chassis.asset.tag: Amazon EC2
dmi.chassis.type: 1
dmi.chassis.vendor: Amazon EC2
dmi.modalias: dmi:bvnAmazonEC2:bvr1.0:bd10/16/2017:br1.0:svnAmazonEC2:pnu-12tb1.112xlarge:pvr:rvnAmazonEC2:rn:rvr:cvnAmazonEC2:ct1:cvr:sku:
dmi.product.name: u-12tb1.112xlarge
dmi.sys.vendor: Amazon EC2

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2023143/+subscriptions

--
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp