Hi folks,

I just noticed a whacky memory usage profile when running some basic IO tests on a current 4.8 tree. From my monitoring graphs it looked like there was a massive memory leak - doing buffered IO was causing huge amounts of memory to be considered used, but the cache size was not increasing.
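The effect is easy enough to see from userspace - nothing fancier than polling a couple of /proc/meminfo fields while the test runs, something along these lines:

$ while true; do
>     grep -E '^(MemFree|Buffers|Cached):' /proc/meminfo
>     sleep 5
> done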
Looking at /proc/meminfo:

$ cat /proc/meminfo
MemTotal: 16395408 kB
MemFree: 79424 kB
MemAvailable: 2497240 kB
Buffers: 4372 kB
Cached: 558744 kB
SwapCached: 48 kB
Active: 2127212 kB
Inactive: 100400 kB
Active(anon): 25348 kB
Inactive(anon): 79424 kB
Active(file): 2101864 kB
Inactive(file): 20976 kB
Unevictable: 13612980 kB    <<<<<<<<<
Mlocked: 3516 kB
SwapTotal: 497976 kB
SwapFree: 497188 kB
Dirty: 0 kB
Writeback: 0 kB
AnonPages: 38784 kB
Mapped: 15880 kB
Shmem: 8808 kB
Slab: 460408 kB
SReclaimable: 428496 kB
SUnreclaim: 31912 kB
KernelStack: 6112 kB
PageTables: 6740 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 8695680 kB
Committed_AS: 177456 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 0 kB
VmallocChunk: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 14204 kB
DirectMap2M: 16762880 kB

It seems that whatever was happening was causing unevictable memory to pile up. But when I look at the per-node stats, the memory is all accounted as active file pages:

$ cat /sys/bus/node/devices/node0/meminfo
Node 0 MemTotal: 4029052 kB
Node 0 MemFree: 33276 kB
Node 0 MemUsed: 3995776 kB
Node 0 Active: 3283280 kB
Node 0 Inactive: 580668 kB
Node 0 Active(anon): 3564 kB
Node 0 Inactive(anon): 4716 kB
Node 0 Active(file): 3279716 kB    <<<<<<<<
Node 0 Inactive(file): 575952 kB
Node 0 Unevictable: 1648 kB
Node 0 Mlocked: 1648 kB
Node 0 Dirty: 8 kB
Node 0 Writeback: 0 kB
Node 0 FilePages: 78796 kB
Node 0 Mapped: 5540 kB
Node 0 AnonPages: 8020 kB
Node 0 Shmem: 256 kB
Node 0 KernelStack: 2352 kB
Node 0 PageTables: 1976 kB
Node 0 NFS_Unstable: 0 kB
Node 0 Bounce: 0 kB
Node 0 WritebackTmp: 0 kB
Node 0 Slab: 109012 kB
Node 0 SReclaimable: 99156 kB
Node 0 SUnreclaim: 9856 kB
Node 0 HugePages_Total: 0
Node 0 HugePages_Free: 0
Node 0 HugePages_Surp: 0

$ cat /sys/bus/node/devices/node1/meminfo
Node 1 MemTotal: 4127912 kB
Node 1 MemFree: 13888 kB
Node 1 MemUsed: 4114024 kB
Node 1 Active: 3455400 kB
Node 1 Inactive: 522156 kB
Node 1 Active(anon): 5556 kB
Node 1 Inactive(anon): 6784 kB
Node 1 Active(file): 3449844 kB
Node 1 Inactive(file): 515372 kB
Node 1 Unevictable: 52 kB
Node 1 Mlocked: 52 kB
Node 1 Dirty: 16 kB
Node 1 Writeback: 0 kB
Node 1 FilePages: 155684 kB
Node 1 Mapped: 2216 kB
Node 1 AnonPages: 12320 kB
Node 1 Shmem: 16 kB
Node 1 KernelStack: 720 kB
Node 1 PageTables: 1120 kB
Node 1 NFS_Unstable: 0 kB
Node 1 Bounce: 0 kB
Node 1 WritebackTmp: 0 kB
Node 1 Slab: 117340 kB
Node 1 SReclaimable: 111472 kB
Node 1 SUnreclaim: 5868 kB
Node 1 HugePages_Total: 0
Node 1 HugePages_Free: 0
Node 1 HugePages_Surp: 0

$ cat /sys/bus/node/devices/node2/meminfo
Node 2 MemTotal: 4127912 kB
Node 2 MemFree: 21308 kB
Node 2 MemUsed: 4106604 kB
Node 2 Active: 3453056 kB
Node 2 Inactive: 517824 kB
Node 2 Active(anon): 3224 kB
Node 2 Inactive(anon): 4356 kB
Node 2 Active(file): 3449832 kB
Node 2 Inactive(file): 513468 kB
Node 2 Unevictable: 556 kB
Node 2 Mlocked: 556 kB
Node 2 Dirty: 0 kB
Node 2 Writeback: 0 kB
Node 2 FilePages: 150120 kB
Node 2 Mapped: 1840 kB
Node 2 AnonPages: 7476 kB
Node 2 Shmem: 232 kB
Node 2 KernelStack: 1184 kB
Node 2 PageTables: 1360 kB
Node 2 NFS_Unstable: 0 kB
Node 2 Bounce: 0 kB
Node 2 WritebackTmp: 0 kB
Node 2 Slab: 114288 kB
Node 2 SReclaimable: 107616 kB
Node 2 SUnreclaim: 6672 kB
Node 2 HugePages_Total: 0
Node 2 HugePages_Free: 0
Node 2 HugePages_Surp: 0

$ cat /sys/bus/node/devices/node3/meminfo
Node 3 MemTotal: 4110532 kB
Node 3 MemFree: 10224 kB
Node 3 MemUsed: 4100308 kB
Node 3 Active: 3442224 kB
Node 3 Inactive: 506564 kB
Node 3 Active(anon): 8636 kB
Node 3 Inactive(anon): 9492 kB
Node 3 Active(file): 3433588 kB
Node 3 Inactive(file): 497072 kB
Node 3 Unevictable: 1260 kB
Node 3 Mlocked: 1260 kB
Node 3 Dirty: 0 kB
Node 3 Writeback: 0 kB
Node 3 FilePages: 178564 kB
Node 3 Mapped: 6284 kB
Node 3 AnonPages: 10968 kB
Node 3 Shmem: 8304 kB
Node 3 KernelStack: 1856 kB
Node 3 PageTables: 2284 kB
Node 3 NFS_Unstable: 0 kB
Node 3 Bounce: 0 kB
Node 3 WritebackTmp: 0 kB
Node 3 Slab: 119736 kB
Node 3 SReclaimable: 110252 kB
Node 3 SUnreclaim: 9484 kB
Node 3 HugePages_Total: 0
Node 3 HugePages_Free: 0
Node 3 HugePages_Surp: 0
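FWIW, the four per-node Active(file) values above add up to 13612980 kB - exactly the global Unevictable figure - which on its own looks like an accounting mixup (the global number being fed from the wrong per-node counter) rather than that memory really being unevictable. A quick way to cross-check is just to sum the same files quoted above, e.g.:

$ awk '$3 == "Active(file):" { sum += $4 } END { print sum " kB" }' \
>     /sys/bus/node/devices/node*/meminfo
$ grep Unevictable /proc/meminfo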
So clearly there's an accounting problem here. I think there may be multiple problems, however. The workload is simple:

$ time for i in `seq 0 1 100`; do
> sudo rm -f /mnt/scratch/testfile
> sudo xfs_io -f -c "pwrite 0 512m -b 32k" -c "pwrite 0 511m -b 32k" \
> /mnt/scratch/testfile &> /dev/null
> done

It's just writing 512MB to a file twice, then unlinking it. Then doing it again. On unlink, the page cache is invalidated, which means all 512MB of cached pages should be freed and removed from the page cache. According to the per-node counters, that is not happening and there are gigabytes of invalidated pages still sitting on the active LRUs. Something is broken....

Cheers,

Dave.
--
Dave Chinner
[email protected]

