Re: possible circular locking dependency detected

2016-09-02 Thread CAI Qian


- Original Message -
> From: "CAI Qian" 
> To: "Linus Torvalds" 
> Cc: "Al Viro" , "Miklos Szeredi" 
> , "Rainer Weikusat"
> , "Hannes Frederic Sowa" 
> , "Rainer Weikusat"
> , "Eric Sandeen" , 
> "Network Development"
> 
> Sent: Friday, September 2, 2016 11:51:58 AM
> Subject: Re: possible circular locking dependency detected
> 
> 
> 
> - Original Message -
> > From: "CAI Qian" 
> > To: "Linus Torvalds" 
> > Cc: "Al Viro" , "Miklos Szeredi"
> > , "Rainer Weikusat"
> > , "Hannes Frederic Sowa"
> > , "Rainer Weikusat"
> > , "Eric Sandeen" ,
> > "Network Development"
> > 
> > Sent: Friday, September 2, 2016 10:43:20 AM
> > Subject: Re: possible circular locking dependency detected
> > 
> > 
> > 
> > - Original Message -
> > > From: "Linus Torvalds" 
> > > To: "Al Viro" , "CAI Qian" 
> > > Cc: "Miklos Szeredi" , "Rainer Weikusat"
> > > , "Hannes Frederic Sowa"
> > > , "Rainer Weikusat"
> > > , "Eric Sandeen"
> > > , "Network Development" 
> > > Sent: Thursday, September 1, 2016 6:04:38 PM
> > > Subject: Re: possible circular locking dependency detected
> > > 
> > > On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds
> > >  wrote:
> > > > On Thu, Sep 1, 2016 at 2:01 PM, Al Viro 
> > > > wrote:
> > > >>
> > > >> Outside as in "all fs activity in bind happens under it".  Along with
> > > >> assignment to ->u.addr, etc.  IOW, make it the outermost lock there.
> > > >
> > > > Hah, yes. I misunderstood you.
> > > >
> > > > Yes. In fact that fixes the problem I mentioned, rather than
> > > > introducing
> > > > it.
> > > 
> > > So the easiest approach would seem to be to revert commit c845acb324aa
> > > ("af_unix: Fix splice-bind deadlock"), and then apply the lock split.
> > > 
> > > Like the attached two patches.
> > > 
> > > This is still *entirely* untested.
> > Tested-by: CAI Qian 
OK, this tag still stands. The issue below also reproduces without those
patches, so a separate problem was most likely introduced recently (after
rc3 or rc4), probably by some xfs update.
   CAI Qian 
> Actually, I take that back; splice now seems to deadlock when running the
> reproducer,
> 
> https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/splice/splice01.c
> 
> [ 1749.956818]
> [ 1749.958492] ==
> [ 1749.965386] [ INFO: possible circular locking dependency detected ]
> [ 1749.972381] 4.8.0-rc4+ #34 Not tainted
> [ 1749.976560] ---
> [ 1749.983554] splice01/35921 is trying to acquire lock:
> [ 1749.989188]  (&sb->s_type->i_mutex_key#14){+.+.+.}, at:
> [] xfs_file_buffered_aio_write+0x127/0x840 [xfs]
> [ 1750.001644]
> [ 1750.001644] but task is already holding lock:
> [ 1750.008151]  (&pipe->mutex/1){+.+.+.}, at: []
> pipe_lock+0x51/0x60
> [ 1750.016753]
> [ 1750.016753] which lock already depends on the new lock.
> [ 1750.016753]
> [ 1750.025880]
> [ 1750.025880] the existing dependency chain (in reverse order) is:
> [ 1750.034229]
> -> #2 (&pipe->mutex/1){+.+.+.}:
> [ 1750.039139][] lock_acquire+0x1fa/0x440
> [ 1750.045857][] mutex_lock_nested+0xdd/0x850
> [ 1750.052963][] pipe_lock+0x51/0x60
> [ 1750.059190][] splice_to_pipe+0x75/0x9e0
> [ 1750.066001][]
> __generic_file_splice_read+0xa71/0xe90
> [ 1750.074071][]
> generic_file_splice_read+0xc1/0x1f0
> [ 1750.081849][] xfs_file_splice_read+0x368/0x7b0
> [xfs]
> [ 1750.089940][] do_splice_to+0xee/0x150
> [ 1750.096555][] SyS_splice+0x1144/0x1c10
> [ 1750.103269][] do_syscall_64+0x1a6/0x500
> [ 1750.110084][] return_from_SYSCALL_64+0x0/0x7a
> [ 1750.117479]
> -> #1 (&(&ip->i_iolock)->mr_lock#2){++}:
> [ 1750.123649][] lock_acquire+0x1fa/0x440
> [ 1750.130362][] down_write_nested+0x5e/0xe0
> [ 1750.137371][] xfs_ilock+0x2fe/0x550 [xfs]
> [ 1750.144397][]
> xfs_file_buffered_aio_write+0x134/0x840 [xfs]
> [ 1750.153175][] xfs_file_write_iter+0x26d/0x6d0
> [xfs]
> [ 1750.16

Re: possible circular locking dependency detected

2016-09-02 Thread CAI Qian


- Original Message -
> From: "CAI Qian" 
> To: "Linus Torvalds" 
> Cc: "Al Viro" , "Miklos Szeredi" 
> , "Rainer Weikusat"
> , "Hannes Frederic Sowa" 
> , "Rainer Weikusat"
> , "Eric Sandeen" , 
> "Network Development"
> 
> Sent: Friday, September 2, 2016 10:43:20 AM
> Subject: Re: possible circular locking dependency detected
> 
> 
> 
> - Original Message -
> > From: "Linus Torvalds" 
> > To: "Al Viro" , "CAI Qian" 
> > Cc: "Miklos Szeredi" , "Rainer Weikusat"
> > , "Hannes Frederic Sowa"
> > , "Rainer Weikusat"
> > , "Eric Sandeen"
> > , "Network Development" 
> > Sent: Thursday, September 1, 2016 6:04:38 PM
> > Subject: Re: possible circular locking dependency detected
> > 
> > On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds
> >  wrote:
> > > On Thu, Sep 1, 2016 at 2:01 PM, Al Viro  wrote:
> > >>
> > >> Outside as in "all fs activity in bind happens under it".  Along with
> > >> assignment to ->u.addr, etc.  IOW, make it the outermost lock there.
> > >
> > > Hah, yes. I misunderstood you.
> > >
> > > Yes. In fact that fixes the problem I mentioned, rather than introducing
> > > it.
> > 
> > So the easiest approach would seem to be to revert commit c845acb324aa
> > ("af_unix: Fix splice-bind deadlock"), and then apply the lock split.
> > 
> > Like the attached two patches.
> > 
> > This is still *entirely* untested.
> Tested-by: CAI Qian 
Actually, I take that back; splice now seems to deadlock when running the
reproducer,

https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/splice/splice01.c

[ 1749.956818] 
[ 1749.958492] ==
[ 1749.965386] [ INFO: possible circular locking dependency detected ]
[ 1749.972381] 4.8.0-rc4+ #34 Not tainted
[ 1749.976560] ---
[ 1749.983554] splice01/35921 is trying to acquire lock:
[ 1749.989188]  (&sb->s_type->i_mutex_key#14){+.+.+.}, at: [] 
xfs_file_buffered_aio_write+0x127/0x840 [xfs]
[ 1750.001644] 
[ 1750.001644] but task is already holding lock:
[ 1750.008151]  (&pipe->mutex/1){+.+.+.}, at: [] 
pipe_lock+0x51/0x60
[ 1750.016753] 
[ 1750.016753] which lock already depends on the new lock.
[ 1750.016753] 
[ 1750.025880] 
[ 1750.025880] the existing dependency chain (in reverse order) is:
[ 1750.034229] 
-> #2 (&pipe->mutex/1){+.+.+.}:
[ 1750.039139][] lock_acquire+0x1fa/0x440
[ 1750.045857][] mutex_lock_nested+0xdd/0x850
[ 1750.052963][] pipe_lock+0x51/0x60
[ 1750.059190][] splice_to_pipe+0x75/0x9e0
[ 1750.066001][] 
__generic_file_splice_read+0xa71/0xe90
[ 1750.074071][] generic_file_splice_read+0xc1/0x1f0
[ 1750.081849][] xfs_file_splice_read+0x368/0x7b0 
[xfs]
[ 1750.089940][] do_splice_to+0xee/0x150
[ 1750.096555][] SyS_splice+0x1144/0x1c10
[ 1750.103269][] do_syscall_64+0x1a6/0x500
[ 1750.110084][] return_from_SYSCALL_64+0x0/0x7a
[ 1750.117479] 
-> #1 (&(&ip->i_iolock)->mr_lock#2){++}:
[ 1750.123649][] lock_acquire+0x1fa/0x440
[ 1750.130362][] down_write_nested+0x5e/0xe0
[ 1750.137371][] xfs_ilock+0x2fe/0x550 [xfs]
[ 1750.144397][] 
xfs_file_buffered_aio_write+0x134/0x840 [xfs]
[ 1750.153175][] xfs_file_write_iter+0x26d/0x6d0 [xfs]
[ 1750.161177][] __vfs_write+0x2be/0x640
[ 1750.167799][] vfs_write+0x152/0x4b0
[ 1750.174220][] SyS_write+0xdf/0x1d0
[ 1750.180547][] entry_SYSCALL_64_fastpath+0x1f/0xbd
[ 1750.188328] 
-> #0 (&sb->s_type->i_mutex_key#14){+.+.+.}:
[ 1750.194508][] __lock_acquire+0x3043/0x3dd0
[ 1750.201609][] lock_acquire+0x1fa/0x440
[ 1750.208321][] down_write+0x5a/0xe0
[ 1750.214645][] 
xfs_file_buffered_aio_write+0x127/0x840 [xfs]
[ 1750.223421][] xfs_file_write_iter+0x26d/0x6d0 [xfs]
[ 1750.231423][] vfs_iter_write+0x29e/0x550
[ 1750.238330][] iter_file_splice_write+0x529/0xb70
[ 1750.246012][] SyS_splice+0x724/0x1c10
[ 1750.252627][] do_syscall_64+0x1a6/0x500
[ 1750.259438][] return_from_SYSCALL_64+0x0/0x7a
[ 1750.266830] 
[ 1750.266830] other info that might help us debug this:
[ 1750.266830] 
[ 1750.275764] Chain exists of:
  &sb->s_type->i_mutex_key#14 --> &(&ip->i_iolock)->mr_lock#2 --> &pipe->mutex/1

[ 1750.287213]  Possible unsafe locking scenario:
[ 1750.287213] 
[ 1750.293817]CPU0CPU

Re: possible circular locking dependency detected

2016-09-02 Thread CAI Qian


- Original Message -
> From: "Linus Torvalds" 
> To: "Al Viro" , "CAI Qian" 
> Cc: "Miklos Szeredi" , "Rainer Weikusat" 
> , "Hannes Frederic Sowa"
> , "Rainer Weikusat" 
> , "Eric Sandeen"
> , "Network Development" 
> Sent: Thursday, September 1, 2016 6:04:38 PM
> Subject: Re: possible circular locking dependency detected
> 
> On Thu, Sep 1, 2016 at 2:43 PM, Linus Torvalds
>  wrote:
> > On Thu, Sep 1, 2016 at 2:01 PM, Al Viro  wrote:
> >>
> >> Outside as in "all fs activity in bind happens under it".  Along with
> >> assignment to ->u.addr, etc.  IOW, make it the outermost lock there.
> >
> > Hah, yes. I misunderstood you.
> >
> > Yes. In fact that fixes the problem I mentioned, rather than introducing
> > it.
> 
> So the easiest approach would seem to be to revert commit c845acb324aa
> ("af_unix: Fix splice-bind deadlock"), and then apply the lock split.
> 
> Like the attached two patches.
> 
> This is still *entirely* untested.
Tested-by: CAI Qian 
> 
> Rainer?
> 
>  Linus
> 


Re: possible circular locking dependency detected (bisected)

2016-09-01 Thread CAI Qian
FYI, the regression is tracked here,
https://bugzilla.kernel.org/show_bug.cgi?id=155781
   CAI Qian

- Original Message -
> From: "Rainer Weikusat" 
> To: "CAI Qian" 
> Cc: "Rainer Weikusat" , 
> secur...@kernel.org, "Miklos Szeredi"
> , "Eric Sandeen" , "Network 
> Development" 
> Sent: Wednesday, August 31, 2016 4:16:25 PM
> Subject: Re: possible circular locking dependency detected (bisected)
> 
> CAI Qian  writes:
> > Reverting the patch below fixes this problem.
> >
> > c845acb324aa85a39650a14e7696982ceea75dc1
> > af_unix: Fix splice-bind deadlock
> 
> Reverting a patch fixing one deadlock in order to avoid another deadlock
> leaves the 'net situation' unchanged. The idea of the other patch was to
> change unix_mknod such that it doesn't do __sb_start_write with
> u->readlock held anymore. As far as I understand the output below,
> overlayfs introduces an additional codepath where unix_mknod ends up doing
> __sb_start_write again. That re-adds the original deadlock, cf.,
> 
> B: splice() from a pipe to /mnt/regular_file
> does sb_start_write() on /mnt
> C: try to freeze /mnt
> wait for B to finish with /mnt
> A: bind() try to bind our socket to /mnt/new_socket_name
> lock our socket, see it not bound yet
> decide that it needs to create something in /mnt
> try to do sb_start_write() on /mnt, block (it's
> waiting for C).
> D: splice() from the same pipe to our socket
> lock the pipe, see that socket is connected
> try to lock the socket, block waiting for A
>     B:  get around to actually feeding a chunk from
>         pipe to file, try to lock the pipe.  Deadlock.
> 
> 
> as A will again acquire the readlock and then call __sb_start_write.
> 
> >
> >CAI Qian
> >
> > - Original Message -
> >> From: "CAI Qian" 
> >> To: secur...@kernel.org
> >> Cc: "Miklos Szeredi" , "Eric Sandeen"
> >> 
> >> Sent: Tuesday, August 30, 2016 5:05:45 PM
> >> Subject: Re: possible circular locking dependency detected
> >> 
> >> FYI, this one can only be reproduced using the overlayfs docker backend.
> >> The device-mapper backend works fine. The XFS below has ftype=1.
> >> 
> >> # cp recvmsg01 /mnt
> >> # docker run -it -v /mnt/:/mnt/ rhel7 bash
> >> [root@c33c99aedd93 /]# mount
> >> overlay on / type overlay
> >> (rw,relatime,seclabel,lowerdir=l/I5VXL74ENBNAEARZ4M2SIN3XD6:l/KZGBKPXLDXUGHYWMERFUBM4FRP,upperdir=9a7c1f735166b1f63d220b4b6c59cc37f3922719ef810c97182b814c1ab336df/diff,workdir=9a7c1f735166b1f63d220b4b6c59cc37f3922719ef810c97182b814c1ab336df/work)
> >> ...
> >> [root@c33c99aedd93 /]# /mnt/recvmsg01
> >> CAI Qian
> >> 
> >> - Original Message -
> >> > From: "CAI Qian" 
> >> > To: secur...@kernel.org
> >> > Sent: Friday, August 26, 2016 10:50:57 AM
> >> > Subject: possible circular locking dependency detected
> >> > 
> >> > FYI, just want to give a heads-up to see if there is anything obvious so
> >> > we can avoid a possible DoS somehow.
> >> > 
> >> > Running the LTP syscalls tests inside a container until the test below
> >> > triggers,
> >> > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/recvmsg/recvmsg01.c
> >> > 
> >> > [ 4441.904103] open04 (42409) used greatest stack depth: 20552 bytes
> >> > left
> >> > [ 4605.419167]
> >> > [ 4605.420831] ==
> >> > [ 4605.427727] [ INFO: possible circular locking dependency detected ]
> >> > [ 4605.434720] 4.8.0-rc3+ #3 Not tainted
> >> > [ 4605.438803] ---
> >> > [ 4605.445796] recvmsg01/42878 is trying to acquire lock:
> >> > [ 4605.451528]  (sb_writers#8){.+.+.+}, at: []
> >> > __sb_start_write+0xb4/0xf0
> >> > [ 4605.460642]
> >> > [ 4605.460642] but task is already holding lock:
> >> > [ 4605.467150]  (&u->readlock){+.+.+.}, at: []
> >> > unix_bind+0x299/0xdf0
> >> > [ 4605.475749]
> >> > [ 4605.475749] which lock already depends on the new lock.
> >> > [ 4605.475749]
> >> > [ 4605.484882]
> >> > [ 4605.484882] the existing dependency chain (in

Re: possible circular locking dependency detected (bisected)

2016-08-31 Thread CAI Qian
Reverting the patch below fixes this problem.

c845acb324aa85a39650a14e7696982ceea75dc1
af_unix: Fix splice-bind deadlock

   CAI Qian

- Original Message -
> From: "CAI Qian" 
> To: secur...@kernel.org
> Cc: "Miklos Szeredi" , "Eric Sandeen" 
> 
> Sent: Tuesday, August 30, 2016 5:05:45 PM
> Subject: Re: possible circular locking dependency detected
> 
> FYI, this one can only be reproduced using the overlayfs docker backend.
> The device-mapper backend works fine. The XFS below has ftype=1.
> 
> # cp recvmsg01 /mnt
> # docker run -it -v /mnt/:/mnt/ rhel7 bash
> [root@c33c99aedd93 /]# mount
> overlay on / type overlay
> (rw,relatime,seclabel,lowerdir=l/I5VXL74ENBNAEARZ4M2SIN3XD6:l/KZGBKPXLDXUGHYWMERFUBM4FRP,upperdir=9a7c1f735166b1f63d220b4b6c59cc37f3922719ef810c97182b814c1ab336df/diff,workdir=9a7c1f735166b1f63d220b4b6c59cc37f3922719ef810c97182b814c1ab336df/work)
> ...
> [root@c33c99aedd93 /]# /mnt/recvmsg01
> CAI Qian
> 
> - Original Message -
> > From: "CAI Qian" 
> > To: secur...@kernel.org
> > Sent: Friday, August 26, 2016 10:50:57 AM
> > Subject: possible circular locking dependency detected
> > 
> > FYI, just want to give a heads-up to see if there is anything obvious so
> > we can avoid a possible DoS somehow.
> > 
> > Running the LTP syscalls tests inside a container until the test below
> > triggers,
> > https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/recvmsg/recvmsg01.c
> > 
> > [ 4441.904103] open04 (42409) used greatest stack depth: 20552 bytes left
> > [ 4605.419167]
> > [ 4605.420831] ==
> > [ 4605.427727] [ INFO: possible circular locking dependency detected ]
> > [ 4605.434720] 4.8.0-rc3+ #3 Not tainted
> > [ 4605.438803] ---
> > [ 4605.445796] recvmsg01/42878 is trying to acquire lock:
> > [ 4605.451528]  (sb_writers#8){.+.+.+}, at: []
> > __sb_start_write+0xb4/0xf0
> > [ 4605.460642]
> > [ 4605.460642] but task is already holding lock:
> > [ 4605.467150]  (&u->readlock){+.+.+.}, at: []
> > unix_bind+0x299/0xdf0
> > [ 4605.475749]
> > [ 4605.475749] which lock already depends on the new lock.
> > [ 4605.475749]
> > [ 4605.484882]
> > [ 4605.484882] the existing dependency chain (in reverse order) is:
> > [ 4605.493234]
> > [ 4605.493234] -> #2 (&u->readlock){+.+.+.}:
> > [ 4605.497943][] lock_acquire+0x1fa/0x440
> > [ 4605.504659][]
> > mutex_lock_interruptible_nested+0xdd/0x920
> > [ 4605.513119][] unix_bind+0x299/0xdf0
> > [ 4605.519540][] SYSC_bind+0x1d8/0x240
> > [ 4605.525964][] SyS_bind+0xe/0x10
> > [ 4605.531998][] do_syscall_64+0x1a6/0x500
> > [ 4605.538811][] return_from_SYSCALL_64+0x0/0x7a
> > [ 4605.546203]
> > [ 4605.546203] -> #1 (&type->i_mutex_dir_key#3/1){+.+.+.}:
> > [ 4605.552292][] lock_acquire+0x1fa/0x440
> > [ 4605.559002][] down_write_nested+0x5e/0xe0
> > [ 4605.566008][] filename_create+0x155/0x470
> > [ 4605.573013][] SyS_mkdir+0xaf/0x1f0
> > [ 4605.579339][]
> > entry_SYSCALL_64_fastpath+0x1f/0xbd
> > [ 4605.587119]
> > [ 4605.587119] -> #0 (sb_writers#8){.+.+.+}:
> > [ 4605.591835][] __lock_acquire+0x3043/0x3dd0
> > [ 4605.598935][] lock_acquire+0x1fa/0x440
> > [ 4605.605646][] percpu_down_read+0x4f/0xa0
> > [ 4605.612552][] __sb_start_write+0xb4/0xf0
> > [ 4605.619459][] mnt_want_write+0x41/0xb0
> > [ 4605.626173][] ovl_want_write+0x76/0xa0
> > [overlay]
> > [ 4605.633860][] ovl_create_object+0xa3/0x2d0
> > [overlay]
> > [ 4605.641942][] ovl_mknod+0x31/0x40 [overlay]
> > [ 4605.649138][] vfs_mknod+0x34b/0x560
> > [ 4605.655570][] unix_bind+0x4ca/0xdf0
> > [ 4605.661991][] SYSC_bind+0x1d8/0x240
> > [ 4605.668412][] SyS_bind+0xe/0x10
> > [ 4605.674456][] do_syscall_64+0x1a6/0x500
> > [ 4605.681266][] return_from_SYSCALL_64+0x0/0x7a
> > [ 4605.688657]
> > [ 4605.688657] other info that might help us debug this:
> > [ 4605.688657]
> > [ 4605.697590] Chain exists of:
> > [ 4605.697590]   sb_writers#8 --> &type->i_mutex_dir_key#3/1 -->
> > &u->readlock
> > [ 4605.697590]
> > [ 4605.707287]  Possible unsafe locking scenario:
> > [ 4605.707287]
> > [ 4605.713890]CPU0CPU1
> > [ 4605.718943] 

Re: [PATCH net] rhashtable: fix a memory leak in alloc_bucket_locks()

2016-08-26 Thread CAI Qian
After applying this patch and running the reproducer (compiling gcc), there
was no bucket_table_alloc in the kmemleak report anymore. Hence,

Tested-by: CAI Qian 

Funny enough, it now gave me this,

[ 3406.807461] kmemleak: 1353 new suspected memory leaks (see 
/sys/kernel/debug/kmemleak)

http://people.redhat.com/qcai/tmp/kmemleak.log

   CAI Qian

- Original Message -
> From: "Eric Dumazet" 
> To: "David Miller" 
> Cc: "CAI Qian" , "Thomas Graf" , "Herbert 
> Xu" , "Eric
> Dumazet" , "Network Development" 
> , "Linus Torvalds"
> , "Florian Westphal" 
> Sent: Friday, August 26, 2016 11:51:39 AM
> Subject: [PATCH net] rhashtable: fix a memory leak in alloc_bucket_locks()
> 
> From: Eric Dumazet 
> 
> If vmalloc() was successful, do not attempt a kmalloc_array()
> 
> Fixes: 4cf0b354d92e ("rhashtable: avoid large lock-array allocations")
> Reported-by: CAI Qian 
> Signed-off-by: Eric Dumazet 
> Cc: Florian Westphal 
> ---
>  lib/rhashtable.c |7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/rhashtable.c b/lib/rhashtable.c
> index 5ba520b544d7..56054e541a0f 100644
> --- a/lib/rhashtable.c
> +++ b/lib/rhashtable.c
> @@ -77,17 +77,18 @@ static int alloc_bucket_locks(struct rhashtable *ht,
> struct bucket_table *tbl,
>   size = min_t(unsigned int, size, tbl->size >> 1);
>  
>   if (sizeof(spinlock_t) != 0) {
> + tbl->locks = NULL;
>  #ifdef CONFIG_NUMA
>   if (size * sizeof(spinlock_t) > PAGE_SIZE &&
>   gfp == GFP_KERNEL)
>   tbl->locks = vmalloc(size * sizeof(spinlock_t));
> - else
>  #endif
>   if (gfp != GFP_KERNEL)
>   gfp |= __GFP_NOWARN | __GFP_NORETRY;
>  
> - tbl->locks = kmalloc_array(size, sizeof(spinlock_t),
> -gfp);
> + if (!tbl->locks)
> + tbl->locks = kmalloc_array(size, sizeof(spinlock_t),
> +gfp);
>   if (!tbl->locks)
>   return -ENOMEM;
>   for (i = 0; i < size; i++)
> 
> 
> 


Re: possible memory leak in ipc

2016-08-26 Thread CAI Qian


- Original Message -
> From: "Linus Torvalds" 
> To: "CAI Qian" , "Thomas Graf" , "Herbert 
> Xu" 
> Cc: "Eric Dumazet" , "Network Development" 
> 
> Sent: Thursday, August 25, 2016 6:20:03 PM
> Subject: Re: possible memory leak in ipc
> 
> On Thu, Aug 25, 2016 at 1:17 PM, CAI Qian  wrote:
> > I am unsure if it is really a memleak (it could be a security issue due to
> > eventual OOM and DoS) or just a soft lockup within the kmemleak code and a
> > false alarm.
> 
> Hmm. The reported leaks look like
> 
> unreferenced object 0xc90004857000 (size 4608):
>   comm "kworker/16:0", pid 110, jiffies 4294705908 (age 883.925s)
>   hex dump (first 32 bytes):
> c0 05 3d 5e 08 88 ff ff ff ff ff ff 00 00 dc 6e  ..=^...n
> ff ff ff ff ff ff ff ff 28 c7 46 83 ff ff ff ff  (.F.
>   backtrace:
> [] kmemleak_alloc+0x4a/0xa0
> [] __vmalloc_node_range+0x1de/0x2f0
> [] vmalloc+0x54/0x60
> [] alloc_bucket_locks.isra.7+0xd4/0xf0
> [] bucket_table_alloc+0x58/0x100
> [] rht_deferred_worker+0x10e/0x890
> [] process_one_work+0x218/0x750
> [] worker_thread+0x125/0x4a0
> [] kthread+0x101/0x120
> [] ret_from_fork+0x1f/0x40
> [] 0x
> 
> which would indicate that it's a rhashtable resize event where we
> perhaps haven't free'd the old hash table when we create a new one.
> 
> The actually freeing of the old one is done RCU-deferred from
> rhashtable_rehash_table(), but that itself is also deferred by a
> worker thread (rht_deferred_worker).
> 
> I'm not seeing anything wrong in the logic, but let's bring in Thomas
> Graf and Herbert Xu.
> 
> Hmm. The size (4608) is always the same and doesn't change, so maybe
> it's not actually a rehash event per se - it's somebody creating a
> rhashtable, but perhaps not freeing it?
> 
> Sadly, all but one of the traces are that kthread one, and the one
> that isn't that might give an idea about what code triggers this is:
> 
> unreferenced object 0xc900048b6000 (size 4608):
>   comm "modprobe", pid 2485, jiffies 4294727633 (age 862.590s)
>   hex dump (first 32 bytes):
> 00 9c 49 21 00 ea ff ff 00 d5 59 21 00 ea ff ff  ..I!..Y!
> 00 a5 7d 21 00 ea ff ff c0 da 74 21 00 ea ff ff  ..}!..t!
>   backtrace:
> [] kmemleak_alloc+0x4a/0xa0
> [] __vmalloc_node_range+0x1de/0x2f0
> [] vmalloc+0x54/0x60
> [] alloc_bucket_locks.isra.7+0xd4/0xf0
> [] bucket_table_alloc+0x58/0x100
> [] rhashtable_init+0x1ed/0x390
> [] 0xa05b201b
> [] do_one_initcall+0x50/0x190
> [] do_init_module+0x60/0x1f3
> [] load_module+0x1487/0x1ca0
> [] SYSC_finit_module+0xa6/0xf0
> [] SyS_finit_module+0xe/0x10
> [] do_syscall_64+0x6c/0x1e0
> [] return_from_SYSCALL_64+0x0/0x7a
> [] 0x
> 
> so it comes from some module init code, but since the module hasn't
> fully initialized, the kallsym code doesn't find the symbol name
> either. Annoying.
> 
> Maybe the above just makes one of the rhashtable people go "Oh, that's
> obvious".
> 
>  Linus
> 
FYI, this also happened while compiling gcc.
$ make -j 56

$ cat /sys/kernel/debug/kmemleak
unreferenced object 0xc9000485a000 (size 4608):
  comm "kworker/7:1", pid 368, jiffies 4294835499 (age 1033.075s)
  hex dump (first 32 bytes):
5f 65 76 65 6e 74 5f 69 74 65 6d 2e 68 00 05 00  _event_item.h...
00 76 6d 73 74 61 74 2e 68 00 05 00 00 70 6c 69  .vmstat.hpli
  backtrace:
[] kmemleak_alloc+0x4a/0xa0
[] __vmalloc_node_range+0x378/0x700
[] vmalloc+0x54/0x60
[] alloc_bucket_locks.isra.7+0x188/0x220
[] bucket_table_alloc+0xac/0x290
[] rht_deferred_worker+0x8c5/0x1890
[] process_one_work+0x731/0x16c0
[] worker_thread+0xdc/0xf10
[] kthread+0x223/0x2f0
[] ret_from_fork+0x1f/0x40
[] 0x
unreferenced object 0xc90004861000 (size 4608):
  comm "kworker/10:1", pid 380, jiffies 4294843418 (age 1025.169s)
  hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
[] kmemleak_alloc+0x4a/0xa0
[] __vmalloc_node_range+0x378/0x700
[] vmalloc+0x54/0x60
[] alloc_bucket_locks.isra.7+0x188/0x220
[] bucket_table_alloc+0xac/0x290
[] rht_deferred_worker+0x15d4/0x1890
[] process_one_work+0x731/0x16c0
[] worker_thread+0xdc/0xf10
[] kthread+0x223/0x2f0
[] ret_from_fork+0x1f/0x40
[] 0x
unreferenced object 0xc90004864000 (size 4608):
  com