Re: possible circular locking dependency detected
----- Original Message -----
> From: "CAI Qian"
> To: "Linus Torvalds"
> Cc: "Al Viro", "Miklos Szeredi", "Rainer Weikusat", "Hannes Frederic Sowa",
>     "Rainer Weikusat", "Eric Sandeen", "Network Development"
> Sent: Friday, September 2, 2016 11:51:58 AM
> Subject: Re: possible circular locking dependency detected
>
> > ----- Original Message -----
> > From: "CAI Qian"
> > To: "Linus Torvalds"
> > Cc: "Al Viro", "Miklos Szeredi", "Rainer Weikusat", "Hannes Frederic Sowa",
> >     "Rainer Weikusat", "Eric Sandeen", "Network Development"
> > Sent: Friday, September 2, 2016 10:43:20 AM
> > Subject: Re: possible circular locking dependency detected
> >
> > > ----- Original Message -----
> > > From: "Linus Torvalds"
> > > To: "Al Viro", "CAI Qian"
> > > Cc: "Miklos Szeredi", "Rainer Weikusat", "Hannes Frederic Sowa",
> > >     "Rainer Weikusat", "Eric Sandeen", "Network Development"
> > > Sent: Thursday, September 1, 2016 6:04:38 PM
> > > Subject: Re: possible circular locking dependency detected
> > >
> > > On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds wrote:
> > > > On Thu, Sep 1, 2016 at 2:01 PM, Al Viro wrote:
> > > >>
> > > >> Outside as in "all fs activity in bind happens under it". Along with
> > > >> assignment to ->u.addr, etc. IOW, make it the outermost lock there.
> > > >
> > > > Hah, yes. I misunderstood you.
> > > >
> > > > Yes. In fact that fixes the problem I mentioned, rather than
> > > > introducing it.
> > >
> > > So the easiest approach would seem to be to revert commit c845acb324aa
> > > ("af_unix: Fix splice-bind deadlock"), and then apply the lock split.
> > >
> > > Like the attached two patches.
> > >
> > > This is still *entirely* untested.
>
> Tested-by: CAI Qian

OK, this tag still stands. The issue below is also reproduced without those
patches, so a separate problem was most likely introduced recently (after
rc3 or rc4), probably by some XFS update.
   CAI Qian

> Actually, I take it back: now splice seems to start deadlocking using the
> reproducer,
>
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/splice/splice01.c
>
> [ 1749.956818]
> [ 1749.958492] ==
> [ 1749.965386] [ INFO: possible circular locking dependency detected ]
> [ 1749.972381] 4.8.0-rc4+ #34 Not tainted
> [ 1749.976560] ---
> [ 1749.983554] splice01/35921 is trying to acquire lock:
> [ 1749.989188] (&sb->s_type->i_mutex_key#14){+.+.+.}, at:
>                [] xfs_file_buffered_aio_write+0x127/0x840 [xfs]
> [ 1750.001644]
> [ 1750.001644] but task is already holding lock:
> [ 1750.008151] (&pipe->mutex/1){+.+.+.}, at: [] pipe_lock+0x51/0x60
> [ 1750.016753]
> [ 1750.016753] which lock already depends on the new lock.
> [ 1750.016753]
> [ 1750.025880]
> [ 1750.025880] the existing dependency chain (in reverse order) is:
> [ 1750.034229]
> -> #2 (&pipe->mutex/1){+.+.+.}:
> [ 1750.039139] [] lock_acquire+0x1fa/0x440
> [ 1750.045857] [] mutex_lock_nested+0xdd/0x850
> [ 1750.052963] [] pipe_lock+0x51/0x60
> [ 1750.059190] [] splice_to_pipe+0x75/0x9e0
> [ 1750.066001] [] __generic_file_splice_read+0xa71/0xe90
> [ 1750.074071] [] generic_file_splice_read+0xc1/0x1f0
> [ 1750.081849] [] xfs_file_splice_read+0x368/0x7b0 [xfs]
> [ 1750.089940] [] do_splice_to+0xee/0x150
> [ 1750.096555] [] SyS_splice+0x1144/0x1c10
> [ 1750.103269] [] do_syscall_64+0x1a6/0x500
> [ 1750.110084] [] return_from_SYSCALL_64+0x0/0x7a
> [ 1750.117479]
> -> #1 (&(&ip->i_iolock)->mr_lock#2){++}:
> [ 1750.123649] [] lock_acquire+0x1fa/0x440
> [ 1750.130362] [] down_write_nested+0x5e/0xe0
> [ 1750.137371] [] xfs_ilock+0x2fe/0x550 [xfs]
> [ 1750.144397] [] xfs_file_buffered_aio_write+0x134/0x840 [xfs]
> [ 1750.153175] [] xfs_file_write_iter+0x26d/0x6d0 [xfs]
> [ 1750.16
Re: possible circular locking dependency detected
----- Original Message -----
> From: "CAI Qian"
> To: "Linus Torvalds"
> Cc: "Al Viro", "Miklos Szeredi", "Rainer Weikusat", "Hannes Frederic Sowa",
>     "Rainer Weikusat", "Eric Sandeen", "Network Development"
> Sent: Friday, September 2, 2016 10:43:20 AM
> Subject: Re: possible circular locking dependency detected
>
> > ----- Original Message -----
> > From: "Linus Torvalds"
> > To: "Al Viro", "CAI Qian"
> > Cc: "Miklos Szeredi", "Rainer Weikusat", "Hannes Frederic Sowa",
> >     "Rainer Weikusat", "Eric Sandeen", "Network Development"
> > Sent: Thursday, September 1, 2016 6:04:38 PM
> > Subject: Re: possible circular locking dependency detected
> >
> > On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds wrote:
> > > On Thu, Sep 1, 2016 at 2:01 PM, Al Viro wrote:
> > >>
> > >> Outside as in "all fs activity in bind happens under it". Along with
> > >> assignment to ->u.addr, etc. IOW, make it the outermost lock there.
> > >
> > > Hah, yes. I misunderstood you.
> > >
> > > Yes. In fact that fixes the problem I mentioned, rather than introducing
> > > it.
> >
> > So the easiest approach would seem to be to revert commit c845acb324aa
> > ("af_unix: Fix splice-bind deadlock"), and then apply the lock split.
> >
> > Like the attached two patches.
> >
> > This is still *entirely* untested.
> Tested-by: CAI Qian

Actually, I take it back: splice now seems to deadlock using the reproducer,

https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/splice/splice01.c

[ 1749.956818]
[ 1749.958492] ==
[ 1749.965386] [ INFO: possible circular locking dependency detected ]
[ 1749.972381] 4.8.0-rc4+ #34 Not tainted
[ 1749.976560] ---
[ 1749.983554] splice01/35921 is trying to acquire lock:
[ 1749.989188] (&sb->s_type->i_mutex_key#14){+.+.+.}, at:
               [] xfs_file_buffered_aio_write+0x127/0x840 [xfs]
[ 1750.001644]
[ 1750.001644] but task is already holding lock:
[ 1750.008151] (&pipe->mutex/1){+.+.+.}, at: [] pipe_lock+0x51/0x60
[ 1750.016753]
[ 1750.016753] which lock already depends on the new lock.
[ 1750.016753]
[ 1750.025880]
[ 1750.025880] the existing dependency chain (in reverse order) is:
[ 1750.034229]
-> #2 (&pipe->mutex/1){+.+.+.}:
[ 1750.039139] [] lock_acquire+0x1fa/0x440
[ 1750.045857] [] mutex_lock_nested+0xdd/0x850
[ 1750.052963] [] pipe_lock+0x51/0x60
[ 1750.059190] [] splice_to_pipe+0x75/0x9e0
[ 1750.066001] [] __generic_file_splice_read+0xa71/0xe90
[ 1750.074071] [] generic_file_splice_read+0xc1/0x1f0
[ 1750.081849] [] xfs_file_splice_read+0x368/0x7b0 [xfs]
[ 1750.089940] [] do_splice_to+0xee/0x150
[ 1750.096555] [] SyS_splice+0x1144/0x1c10
[ 1750.103269] [] do_syscall_64+0x1a6/0x500
[ 1750.110084] [] return_from_SYSCALL_64+0x0/0x7a
[ 1750.117479]
-> #1 (&(&ip->i_iolock)->mr_lock#2){++}:
[ 1750.123649] [] lock_acquire+0x1fa/0x440
[ 1750.130362] [] down_write_nested+0x5e/0xe0
[ 1750.137371] [] xfs_ilock+0x2fe/0x550 [xfs]
[ 1750.144397] [] xfs_file_buffered_aio_write+0x134/0x840 [xfs]
[ 1750.153175] [] xfs_file_write_iter+0x26d/0x6d0 [xfs]
[ 1750.161177] [] __vfs_write+0x2be/0x640
[ 1750.167799] [] vfs_write+0x152/0x4b0
[ 1750.174220] [] SyS_write+0xdf/0x1d0
[ 1750.180547] [] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 1750.188328]
-> #0 (&sb->s_type->i_mutex_key#14){+.+.+.}:
[ 1750.194508] [] __lock_acquire+0x3043/0x3dd0
[ 1750.201609] [] lock_acquire+0x1fa/0x440
[ 1750.208321] [] down_write+0x5a/0xe0
[ 1750.214645] [] xfs_file_buffered_aio_write+0x127/0x840 [xfs]
[ 1750.223421] [] xfs_file_write_iter+0x26d/0x6d0 [xfs]
[ 1750.231423] [] vfs_iter_write+0x29e/0x550
[ 1750.238330] [] iter_file_splice_write+0x529/0xb70
[ 1750.246012] [] SyS_splice+0x724/0x1c10
[ 1750.252627] [] do_syscall_64+0x1a6/0x500
[ 1750.259438] [] return_from_SYSCALL_64+0x0/0x7a
[ 1750.266830]
[ 1750.266830] other info that might help us debug this:
[ 1750.266830]
[ 1750.275764] Chain exists of:
  &sb->s_type->i_mutex_key#14 --> &(&ip->i_iolock)->mr_lock#2 --> &pipe->mutex/1
[ 1750.287213] Possible unsafe locking scenario:
[ 1750.287213]
[ 1750.293817]        CPU0                    CPU
Re: possible circular locking dependency detected
----- Original Message -----
> From: "Linus Torvalds"
> To: "Al Viro", "CAI Qian"
> Cc: "Miklos Szeredi", "Rainer Weikusat", "Hannes Frederic Sowa",
>     "Rainer Weikusat", "Eric Sandeen", "Network Development"
> Sent: Thursday, September 1, 2016 6:04:38 PM
> Subject: Re: possible circular locking dependency detected
>
> On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds wrote:
> > On Thu, Sep 1, 2016 at 2:01 PM, Al Viro wrote:
> >>
> >> Outside as in "all fs activity in bind happens under it". Along with
> >> assignment to ->u.addr, etc. IOW, make it the outermost lock there.
> >
> > Hah, yes. I misunderstood you.
> >
> > Yes. In fact that fixes the problem I mentioned, rather than introducing
> > it.
>
> So the easiest approach would seem to be to revert commit c845acb324aa
> ("af_unix: Fix splice-bind deadlock"), and then apply the lock split.
>
> Like the attached two patches.
>
> This is still *entirely* untested.

Tested-by: CAI Qian

> Rainer?
>
>               Linus
Re: possible circular locking dependency detected (bisected)
FYI, the regression is tracked here,

https://bugzilla.kernel.org/show_bug.cgi?id=155781

   CAI Qian

----- Original Message -----
> From: "Rainer Weikusat"
> To: "CAI Qian"
> Cc: "Rainer Weikusat", secur...@kernel.org, "Miklos Szeredi",
>     "Eric Sandeen", "Network Development"
> Sent: Wednesday, August 31, 2016 4:16:25 PM
> Subject: Re: possible circular locking dependency detected (bisected)
>
> CAI Qian writes:
> > Reverting the patch below fixes this problem.
> >
> > c845acb324aa85a39650a14e7696982ceea75dc1
> > af_unix: Fix splice-bind deadlock
>
> Reverting a patch fixing one deadlock in order to avoid another deadlock
> leaves the 'net situation' unchanged. The idea of the other patch was to
> change unix_mknod such that it doesn't do __sb_start_write with
> u->readlock held anymore. As far as I understand the output below,
> overlayfs introduces an additional codepath where unix_mknod ends up
> doing __sb_start_write again. That's the original deadlock re-added, cf,
>
> B: splice() from a pipe to /mnt/regular_file
>    does sb_start_write() on /mnt
> C: try to freeze /mnt
>    wait for B to finish with /mnt
> A: bind() try to bind our socket to /mnt/new_socket_name
>    lock our socket, see it not bound yet
>    decide that it needs to create something in /mnt
>    try to do sb_start_write() on /mnt, block (it's waiting for C).
> D: splice() from the same pipe to our socket
>    lock the pipe, see that socket is connected
>    try to lock the socket, block waiting for A
> B: get around to actually feeding a chunk from pipe to file,
>    try to lock the pipe. Deadlock.
>
> as A will again acquire the readlock and then call __sb_start_write.
>
> >    CAI Qian
> >
> > ----- Original Message -----
> >> From: "CAI Qian"
> >> To: secur...@kernel.org
> >> Cc: "Miklos Szeredi", "Eric Sandeen"
> >> Sent: Tuesday, August 30, 2016 5:05:45 PM
> >> Subject: Re: possible circular locking dependency detected
> >>
> >> FYI, this one can only be reproduced using the overlayfs docker backend.
> >> The device-mapper backend works fine. The XFS below has ftype=1.
> >>
> >> # cp recvmsg01 /mnt
> >> # docker run -it -v /mnt/:/mnt/ rhel7 bash
> >> [root@c33c99aedd93 /]# mount
> >> overlay on / type overlay
> >> (rw,relatime,seclabel,lowerdir=l/I5VXL74ENBNAEARZ4M2SIN3XD6:l/KZGBKPXLDXUGHYWMERFUBM4FRP,upperdir=9a7c1f735166b1f63d220b4b6c59cc37f3922719ef810c97182b814c1ab336df/diff,workdir=9a7c1f735166b1f63d220b4b6c59cc37f3922719ef810c97182b814c1ab336df/work)
> >> ...
> >> [root@c33c99aedd93 /]# /mnt/recvmsg01
> >>    CAI Qian
> >>
> >> ----- Original Message -----
> >> > From: "CAI Qian"
> >> > To: secur...@kernel.org
> >> > Sent: Friday, August 26, 2016 10:50:57 AM
> >> > Subject: possible circular locking dependency detected
> >> >
> >> > FYI, just want to give a heads-up to see if there is anything obvious so
> >> > we can avoid a possible DoS somehow.
> >> >
> >> > Running the LTP syscalls tests inside a container until this test
> >> > triggers the below,
> >> > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/recvmsg/recvmsg01.c
> >> >
> >> > [ 4441.904103] open04 (42409) used greatest stack depth: 20552 bytes left
> >> > [ 4605.419167]
> >> > [ 4605.420831] ==
> >> > [ 4605.427727] [ INFO: possible circular locking dependency detected ]
> >> > [ 4605.434720] 4.8.0-rc3+ #3 Not tainted
> >> > [ 4605.438803] ---
> >> > [ 4605.445796] recvmsg01/42878 is trying to acquire lock:
> >> > [ 4605.451528] (sb_writers#8){.+.+.+}, at: [] __sb_start_write+0xb4/0xf0
> >> > [ 4605.460642]
> >> > [ 4605.460642] but task is already holding lock:
> >> > [ 4605.467150] (&u->readlock){+.+.+.}, at: [] unix_bind+0x299/0xdf0
> >> > [ 4605.475749]
> >> > [ 4605.475749] which lock already depends on the new lock.
> >> > [ 4605.475749]
> >> > [ 4605.484882]
> >> > [ 4605.484882] the existing dependency chain (in
Re: possible circular locking dependency detected (bisected)
Reverting the patch below fixes this problem.

c845acb324aa85a39650a14e7696982ceea75dc1
af_unix: Fix splice-bind deadlock

   CAI Qian

----- Original Message -----
> From: "CAI Qian"
> To: secur...@kernel.org
> Cc: "Miklos Szeredi", "Eric Sandeen"
> Sent: Tuesday, August 30, 2016 5:05:45 PM
> Subject: Re: possible circular locking dependency detected
>
> FYI, this one can only be reproduced using the overlayfs docker backend.
> The device-mapper backend works fine. The XFS below has ftype=1.
>
> # cp recvmsg01 /mnt
> # docker run -it -v /mnt/:/mnt/ rhel7 bash
> [root@c33c99aedd93 /]# mount
> overlay on / type overlay
> (rw,relatime,seclabel,lowerdir=l/I5VXL74ENBNAEARZ4M2SIN3XD6:l/KZGBKPXLDXUGHYWMERFUBM4FRP,upperdir=9a7c1f735166b1f63d220b4b6c59cc37f3922719ef810c97182b814c1ab336df/diff,workdir=9a7c1f735166b1f63d220b4b6c59cc37f3922719ef810c97182b814c1ab336df/work)
> ...
> [root@c33c99aedd93 /]# /mnt/recvmsg01
>    CAI Qian
>
> ----- Original Message -----
> > From: "CAI Qian"
> > To: secur...@kernel.org
> > Sent: Friday, August 26, 2016 10:50:57 AM
> > Subject: possible circular locking dependency detected
> >
> > FYI, just want to give a heads-up to see if there is anything obvious so
> > we can avoid a possible DoS somehow.
> >
> > Running the LTP syscalls tests inside a container until this test
> > triggers the below,
> > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/recvmsg/recvmsg01.c
> >
> > [ 4441.904103] open04 (42409) used greatest stack depth: 20552 bytes left
> > [ 4605.419167]
> > [ 4605.420831] ==
> > [ 4605.427727] [ INFO: possible circular locking dependency detected ]
> > [ 4605.434720] 4.8.0-rc3+ #3 Not tainted
> > [ 4605.438803] ---
> > [ 4605.445796] recvmsg01/42878 is trying to acquire lock:
> > [ 4605.451528] (sb_writers#8){.+.+.+}, at: [] __sb_start_write+0xb4/0xf0
> > [ 4605.460642]
> > [ 4605.460642] but task is already holding lock:
> > [ 4605.467150] (&u->readlock){+.+.+.}, at: [] unix_bind+0x299/0xdf0
> > [ 4605.475749]
> > [ 4605.475749] which lock already depends on the new lock.
> > [ 4605.475749]
> > [ 4605.484882]
> > [ 4605.484882] the existing dependency chain (in reverse order) is:
> > [ 4605.493234]
> > [ 4605.493234] -> #2 (&u->readlock){+.+.+.}:
> > [ 4605.497943] [] lock_acquire+0x1fa/0x440
> > [ 4605.504659] [] mutex_lock_interruptible_nested+0xdd/0x920
> > [ 4605.513119] [] unix_bind+0x299/0xdf0
> > [ 4605.519540] [] SYSC_bind+0x1d8/0x240
> > [ 4605.525964] [] SyS_bind+0xe/0x10
> > [ 4605.531998] [] do_syscall_64+0x1a6/0x500
> > [ 4605.538811] [] return_from_SYSCALL_64+0x0/0x7a
> > [ 4605.546203]
> > [ 4605.546203] -> #1 (&type->i_mutex_dir_key#3/1){+.+.+.}:
> > [ 4605.552292] [] lock_acquire+0x1fa/0x440
> > [ 4605.559002] [] down_write_nested+0x5e/0xe0
> > [ 4605.566008] [] filename_create+0x155/0x470
> > [ 4605.573013] [] SyS_mkdir+0xaf/0x1f0
> > [ 4605.579339] [] entry_SYSCALL_64_fastpath+0x1f/0xbd
> > [ 4605.587119]
> > [ 4605.587119] -> #0 (sb_writers#8){.+.+.+}:
> > [ 4605.591835] [] __lock_acquire+0x3043/0x3dd0
> > [ 4605.598935] [] lock_acquire+0x1fa/0x440
> > [ 4605.605646] [] percpu_down_read+0x4f/0xa0
> > [ 4605.612552] [] __sb_start_write+0xb4/0xf0
> > [ 4605.619459] [] mnt_want_write+0x41/0xb0
> > [ 4605.626173] [] ovl_want_write+0x76/0xa0 [overlay]
> > [ 4605.633860] [] ovl_create_object+0xa3/0x2d0 [overlay]
> > [ 4605.641942] [] ovl_mknod+0x31/0x40 [overlay]
> > [ 4605.649138] [] vfs_mknod+0x34b/0x560
> > [ 4605.655570] [] unix_bind+0x4ca/0xdf0
> > [ 4605.661991] [] SYSC_bind+0x1d8/0x240
> > [ 4605.668412] [] SyS_bind+0xe/0x10
> > [ 4605.674456] [] do_syscall_64+0x1a6/0x500
> > [ 4605.681266] [] return_from_SYSCALL_64+0x0/0x7a
> > [ 4605.688657]
> > [ 4605.688657] other info that might help us debug this:
> > [ 4605.688657]
> > [ 4605.697590] Chain exists of:
> > [ 4605.697590]   sb_writers#8 --> &type->i_mutex_dir_key#3/1 --> &u->readlock
> > [ 4605.697590]
> > [ 4605.707287] Possible unsafe locking scenario:
> > [ 4605.707287]
> > [ 4605.713890]        CPU0                    CPU1
> > [ 4605.718943]
Re: [PATCH net] rhashtable: fix a memory leak in alloc_bucket_locks()
After applying this patch and running the reproducer (compiling gcc), there
was no bucket_table_alloc in the kmemleak report anymore. Hence,

Tested-by: CAI Qian

Funny enough, it now gave me this,

[ 3406.807461] kmemleak: 1353 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

http://people.redhat.com/qcai/tmp/kmemleak.log

   CAI Qian

----- Original Message -----
> From: "Eric Dumazet"
> To: "David Miller"
> Cc: "CAI Qian", "Thomas Graf", "Herbert Xu", "Eric Dumazet",
>     "Network Development", "Linus Torvalds", "Florian Westphal"
> Sent: Friday, August 26, 2016 11:51:39 AM
> Subject: [PATCH net] rhashtable: fix a memory leak in alloc_bucket_locks()
>
> From: Eric Dumazet
>
> If vmalloc() was successful, do not attempt a kmalloc_array()
>
> Fixes: 4cf0b354d92e ("rhashtable: avoid large lock-array allocations")
> Reported-by: CAI Qian
> Signed-off-by: Eric Dumazet
> Cc: Florian Westphal
> ---
>  lib/rhashtable.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 5ba520b544d7..56054e541a0f 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -77,17 +77,18 @@ static int alloc_bucket_locks(struct rhashtable *ht, struct bucket_table *tbl,
>  	size = min_t(unsigned int, size, tbl->size >> 1);
>  
>  	if (sizeof(spinlock_t) != 0) {
> +		tbl->locks = NULL;
>  #ifdef CONFIG_NUMA
>  		if (size * sizeof(spinlock_t) > PAGE_SIZE &&
>  		    gfp == GFP_KERNEL)
>  			tbl->locks = vmalloc(size * sizeof(spinlock_t));
> -		else
>  #endif
>  		if (gfp != GFP_KERNEL)
>  			gfp |= __GFP_NOWARN | __GFP_NORETRY;
>  
> -		tbl->locks = kmalloc_array(size, sizeof(spinlock_t),
> -					   gfp);
> +		if (!tbl->locks)
> +			tbl->locks = kmalloc_array(size, sizeof(spinlock_t),
> +						   gfp);
>  		if (!tbl->locks)
>  			return -ENOMEM;
>  		for (i = 0; i < size; i++)
Re: possible memory leak in ipc
----- Original Message -----
> From: "Linus Torvalds"
> To: "CAI Qian", "Thomas Graf", "Herbert Xu"
> Cc: "Eric Dumazet", "Network Development"
> Sent: Thursday, August 25, 2016 6:20:03 PM
> Subject: Re: possible memory leak in ipc
>
> On Thu, Aug 25, 2016 at 1:17 PM, CAI Qian wrote:
> > I am unsure if it is really a memleak (could be a security issue due to
> > eventual OOM and DoS) or just a soft lockup in the kmemleak code with a
> > false alarm.
>
> Hmm. The reported leaks look like
>
> unreferenced object 0xc90004857000 (size 4608):
>   comm "kworker/16:0", pid 110, jiffies 4294705908 (age 883.925s)
>   hex dump (first 32 bytes):
>     c0 05 3d 5e 08 88 ff ff ff ff ff ff 00 00 dc 6e  ..=^...n
>     ff ff ff ff ff ff ff ff 28 c7 46 83 ff ff ff ff  (.F.
>   backtrace:
>     [] kmemleak_alloc+0x4a/0xa0
>     [] __vmalloc_node_range+0x1de/0x2f0
>     [] vmalloc+0x54/0x60
>     [] alloc_bucket_locks.isra.7+0xd4/0xf0
>     [] bucket_table_alloc+0x58/0x100
>     [] rht_deferred_worker+0x10e/0x890
>     [] process_one_work+0x218/0x750
>     [] worker_thread+0x125/0x4a0
>     [] kthread+0x101/0x120
>     [] ret_from_fork+0x1f/0x40
>     [] 0x
>
> which would indicate that it's a rhashtable resize event where we
> perhaps haven't freed the old hash table when we create a new one.
>
> The actual freeing of the old one is done RCU-deferred from
> rhashtable_rehash_table(), but that itself is also deferred by a
> worker thread (rht_deferred_worker).
>
> I'm not seeing anything wrong in the logic, but let's bring in Thomas
> Graf and Herbert Xu.
>
> Hmm. The size (4608) is always the same and doesn't change, so maybe
> it's not actually a rehash event per se - it's somebody creating a
> rhashtable, but perhaps not freeing it?
>
> Sadly, all but one of the traces are that kthread one, and the one
> that isn't - which might give an idea about what code triggers this - is:
>
> unreferenced object 0xc900048b6000 (size 4608):
>   comm "modprobe", pid 2485, jiffies 4294727633 (age 862.590s)
>   hex dump (first 32 bytes):
>     00 9c 49 21 00 ea ff ff 00 d5 59 21 00 ea ff ff  ..I!..Y!
>     00 a5 7d 21 00 ea ff ff c0 da 74 21 00 ea ff ff  ..}!..t!
>   backtrace:
>     [] kmemleak_alloc+0x4a/0xa0
>     [] __vmalloc_node_range+0x1de/0x2f0
>     [] vmalloc+0x54/0x60
>     [] alloc_bucket_locks.isra.7+0xd4/0xf0
>     [] bucket_table_alloc+0x58/0x100
>     [] rhashtable_init+0x1ed/0x390
>     [] 0xa05b201b
>     [] do_one_initcall+0x50/0x190
>     [] do_init_module+0x60/0x1f3
>     [] load_module+0x1487/0x1ca0
>     [] SYSC_finit_module+0xa6/0xf0
>     [] SyS_finit_module+0xe/0x10
>     [] do_syscall_64+0x6c/0x1e0
>     [] return_from_SYSCALL_64+0x0/0x7a
>     [] 0x
>
> so it comes from some module init code, but since the module hasn't
> fully initialized, the kallsyms code doesn't find the symbol name
> either. Annoying.
>
> Maybe the above just makes one of the rhashtable people go "Oh, that's
> obvious".
>
>               Linus

FYI, this also happened while compiling gcc.

$ make -j 56
$ cat /sys/kernel/debug/kmemleak
unreferenced object 0xc9000485a000 (size 4608):
  comm "kworker/7:1", pid 368, jiffies 4294835499 (age 1033.075s)
  hex dump (first 32 bytes):
    5f 65 76 65 6e 74 5f 69 74 65 6d 2e 68 00 05 00  _event_item.h...
    00 76 6d 73 74 61 74 2e 68 00 05 00 00 70 6c 69  .vmstat.h....pli
  backtrace:
    [] kmemleak_alloc+0x4a/0xa0
    [] __vmalloc_node_range+0x378/0x700
    [] vmalloc+0x54/0x60
    [] alloc_bucket_locks.isra.7+0x188/0x220
    [] bucket_table_alloc+0xac/0x290
    [] rht_deferred_worker+0x8c5/0x1890
    [] process_one_work+0x731/0x16c0
    [] worker_thread+0xdc/0xf10
    [] kthread+0x223/0x2f0
    [] ret_from_fork+0x1f/0x40
    [] 0x
unreferenced object 0xc90004861000 (size 4608):
  comm "kworker/10:1", pid 380, jiffies 4294843418 (age 1025.169s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  backtrace:
    [] kmemleak_alloc+0x4a/0xa0
    [] __vmalloc_node_range+0x378/0x700
    [] vmalloc+0x54/0x60
    [] alloc_bucket_locks.isra.7+0x188/0x220
    [] bucket_table_alloc+0xac/0x290
    [] rht_deferred_worker+0x15d4/0x1890
    [] process_one_work+0x731/0x16c0
    [] worker_thread+0xdc/0xf10
    [] kthread+0x223/0x2f0
    [] ret_from_fork+0x1f/0x40
    [] 0x
unreferenced object 0xc90004864000 (size 4608):
  com