Hi Jason, I think the problem is in "tun: hold a tun socket during ptr_ring_cleanup".
Pls take a look at the attached patch. On Tue, May 15, 2018 at 11:28:25PM -0700, Andrei Vagin wrote: > We run CRIU tests on linux-next regularly and today we caught this bug: > > https://travis-ci.org/avagin/linux/jobs/379450631 > > [ 50.264837] > ================================================================== > [ 50.264986] BUG: KASAN: use-after-free in > __lock_acquire.isra.30+0x1ad4/0x1bb0 > [ 50.265088] Read of size 8 at addr ffff88018e1728f8 by task criu/1819 > [ 50.265167] > [ 50.265249] CPU: 0 PID: 1819 Comm: criu Not tainted > 4.17.0-rc5-next-20180515+ #1 > [ 50.265251] Hardware name: Google Google Compute Engine/Google Compute > Engine, BIOS Google 01/01/2011 > [ 50.265252] Call Trace: > [ 50.265262] dump_stack+0x71/0xab > [ 50.265265] ? __lock_acquire.isra.30+0x1ad4/0x1bb0 > [ 50.265271] print_address_description+0x6a/0x270 > [ 50.265273] ? __lock_acquire.isra.30+0x1ad4/0x1bb0 > [ 50.265275] kasan_report+0x237/0x360 > [ 50.265278] __lock_acquire.isra.30+0x1ad4/0x1bb0 > [ 50.265285] ? register_netdev+0x30/0x30 > [ 50.265288] lock_acquire+0x10b/0x2a0 > [ 50.265294] ? tun_chr_close+0x1d7/0x4c0 > [ 50.265298] ? kfree+0xd6/0x1f0 > [ 50.265303] _raw_spin_lock+0x25/0x30 > [ 50.265306] ? tun_chr_close+0x1d7/0x4c0 > [ 50.265308] tun_chr_close+0x1d7/0x4c0 > [ 50.265313] ? fcntl_setlk+0xaf0/0xaf0 > [ 50.265320] __fput+0x251/0x770 > [ 50.265324] task_work_run+0x10e/0x180 > [ 50.265330] exit_to_usermode_loop+0xcb/0xf0 > [ 50.265332] do_syscall_64+0x21d/0x280 > [ 50.265335] ? prepare_exit_to_usermode+0x88/0x130 > [ 50.265338] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 50.265342] RIP: 0033:0x1494fa6f93f0 > [ 50.265342] Code: 73 01 c3 48 8b 0d b8 9b 20 00 f7 d8 64 89 01 48 83 c8 ff > c3 66 0f 1f 44 00 00 83 3d 59 e0 20 00 00 75 10 b8 03 00 00 00 0f 05 <48> 3d > 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 0e fa ff ff 48 89 04 24 > [ 50.265388] RSP: 002b:00007ffd229fe7f8 EFLAGS: 00000246 ORIG_RAX: > 0000000000000003 > [ 50.265391] RAX: 0000000000000000 RBX: 0000000000000004 RCX: > 00001494fa6f93f0 > [ 50.265393] RDX: 00007ffd229fe80c RSI: 00000000400454da RDI: > 0000000000000004 > [ 50.265395] RBP: 0000000000000000 R08: 000000000000420b R09: > 0000000000000000 > [ 50.265396] R10: 0000000000000000 R11: 0000000000000246 R12: > 00001494fab116a0 > [ 50.265398] R13: 0000000000000d06 R14: 0000000000000000 R15: > 0000000000000000 > [ 50.265400] > [ 50.265476] Allocated by task 1819: > [ 50.265554] kasan_kmalloc+0xa0/0xd0 > [ 50.265556] __kmalloc+0x13a/0x250 > [ 50.265561] sk_prot_alloc+0xd3/0x250 > [ 50.265564] sk_alloc+0x35/0x9d0 > [ 50.265566] tun_chr_open+0x7b/0x5a0 > [ 50.265570] misc_open+0x313/0x480 > [ 50.265573] chrdev_open+0x1d6/0x4b0 > [ 50.265575] do_dentry_open+0x6ae/0xee0 > [ 50.265578] path_openat+0xce6/0x2890 > [ 50.265580] do_filp_open+0x17a/0x270 > [ 50.265582] do_sys_open+0x203/0x340 > [ 50.265584] do_syscall_64+0xa0/0x280 > [ 50.265586] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 50.265587] > [ 50.265667] Freed by task 1819: > [ 50.265745] __kasan_slab_free+0x130/0x180 > [ 50.265747] kfree+0xd6/0x1f0 > [ 50.265750] __sk_destruct+0x46f/0x580 > [ 50.265752] tun_chr_close+0x330/0x4c0 > [ 50.265754] __fput+0x251/0x770 > [ 50.265756] task_work_run+0x10e/0x180 > [ 50.265758] exit_to_usermode_loop+0xcb/0xf0 > [ 50.265760] do_syscall_64+0x21d/0x280 > [ 50.265762] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > [ 50.265762] > [ 50.265840] The buggy address belongs to the object at ffff88018e172200 > [ 50.265840] which belongs to the cache kmalloc-2048 of size 2048 > [ 50.265927] The buggy address is located 1784 bytes inside of > [ 50.265927] 2048-byte region [ffff88018e172200, ffff88018e172a00) > [ 50.266011] The buggy address belongs to the page: > [ 50.266089] page:ffffea0006385c00 count:1 mapcount:0 > mapping:0000000000000000 index:0x0 compound_mapcount: 0 > [ 50.266178] flags: 0x17fff8000008100(slab|head) > [ 50.266257] raw: 017fff8000008100 0000000000000000 0000000000000000 > 00000001800f000f > [ 50.266342] raw: dead000000000100 dead000000000200 ffff8801d9016800 > 0000000000000000 > [ 50.266425] page dumped because: kasan: bad access detected > [ 50.266501] > [ 50.266590] Memory state around the buggy address: > [ 50.266693] ffff88018e172780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb > fb fb > [ 50.266776] ffff88018e172800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb > fb fb > [ 50.266860] >ffff88018e172880: fb fb fb fb fb fb fb fb fb fb fb fb fb fb > fb fb > [ 50.266943] > ^ > [ 50.267020] ffff88018e172900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb > fb fb > [ 50.267103] ffff88018e172980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb > fb fb > [ 50.267192] > ================================================================== > [ 50.267275] Disabling lock debugging due to kernel taint
>From 7efa9d087be20a67c5c3953f7bf26ae5bafaa061 Mon Sep 17 00:00:00 2001 From: Andrei Vagin <ava...@openvz.org> Date: Wed, 16 May 2018 00:03:39 -0700 Subject: [PATCH] tun: hold a tun socket during ptr_ring_cleanup Otherwise a socket will be destroyed together with a tun_file structure, which is used in ptr_ring_cleanup. This issue was reported by kasan: BUG: KASAN: use-after-free in __lock_acquire.isra.30+0x1ad4/0x1bb0 Read of size 8 at addr ffff88018e1728f8 by task criu/1819 CPU: 0 PID: 1819 Comm: criu Not tainted 4.17.0-rc5-next-20180515+ #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: dump_stack+0x71/0xab print_address_description+0x6a/0x270 kasan_report+0x237/0x360 __lock_acquire.isra.30+0x1ad4/0x1bb0 lock_acquire+0x10b/0x2a0 _raw_spin_lock+0x25/0x30 tun_chr_close+0x1d7/0x4c0 __fput+0x251/0x770 task_work_run+0x10e/0x180 exit_to_usermode_loop+0xcb/0xf0 do_syscall_64+0x21d/0x280 ? prepare_exit_to_usermode+0x88/0x130 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 1819: __kasan_slab_free+0x130/0x180 kfree+0xd6/0x1f0 __sk_destruct+0x46f/0x580 tun_chr_close+0x330/0x4c0 __fput+0x251/0x770 task_work_run+0x10e/0x180 exit_to_usermode_loop+0xcb/0xf0 do_syscall_64+0x21d/0x280 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Andrei Vagin <ava...@openvz.org> --- drivers/net/tun.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 8b0f0a0baab4..f3eae203cc58 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -3246,8 +3246,10 @@ static int tun_chr_close(struct inode *inode, struct file *file) { struct tun_file *tfile = file->private_data; + sock_hold(&tfile->sk); tun_detach(tfile, true); ptr_ring_cleanup(&tfile->tx_ring, tun_ptr_free); + sock_put(&tfile->sk); return 0; } -- 2.17.0