Hi guys,

I sent this to fcoe-devel, but either it is holiday season or that mailing list is abandoned, as the volume of FCoE mail there is pretty low.

On Mon, Jul 23, 2018 at 02:16:31PM +0200, ard wrote:
Date: Mon, 23 Jul 2018 14:16:31 +0200
From: ard <a...@kwaak.net>
Subject: FCOE vn2vn memory leaks in 4.14
To: fcoe-de...@open-fcoe.org

Hi guys,

After an upgrade of one of my systems from 3.10 to 4.14.55, I noticed a serious memory leak. As this kernel is not 100% vanilla, I started the bug report here:
https://github.com/hardkernel/linux/issues/360

The essence is this: I have an FCoE interface assigned to a VLAN on a NIC. These were remnants of a test I did. FCoE was still configured, but no targets were exported to that endpoint. So it would see and join the multicast announcements of 2 other systems, but do nothing with them. This was enough to waste about 600MB of memory in 2 or 3 days.

Some things have changed since then: maybe the number of announcements (due to the heat I turn off systems), or really something in the kernel. But after 1 week I really have to proactively reboot the system in order to avoid OOMs.

I've now disabled the FCoE VLAN on the port of that system, so it won't get any broadcasts. No memory leaks so far.

The kmemleak output is in that bug report; I won't mail it, since it's 2.5MB. The gist seems to be:

  backtrace:
    [<bf3382ec>] fcoe_ctlr_vn_add+0x3c/0x1b4 [libfcoe]
    [<bf338c64>] fcoe_ctlr_vn_recv+0x800/0xb2c [libfcoe]
    [<bf33a400>] fcoe_ctlr_recv_work+0xb94/0x17f0 [libfcoe]
    [<c013dbb0>] process_one_work+0x138/0x4bc

These seem to stand out:

  root@odroid5:~# grep -c fcoe_ctlr_vn_add kmemleak.txt;grep -c fcoe_fip_vlan_recv kmemleak.txt
  1090
  898

So there are 2 leaks: network skb leaks, I presume, and fcoe structure leaks.

Except for one system that I turn off and on once a day, all other systems are running stably (on an older kernel, though). The system I turn off and on again also has some vn2vn problems, and that's also a 4.14 kernel.
(It is a Steam Machine with a SteamOS kernel; FCoE is not actively used, but with a bcache on one of the targets, it probably auto-registers a dependency.) This is outside the scope of this ticket though.

The system with the memory leak is a system intended to run 24/7.

If anyone can point me to the right place, or help me...

Regards,
Ard van Breemen
-- 
.signature not found