Re: INFO: task hung in __get_super

2019-05-16 Thread Jan Kara
On Thu 16-05-19 21:17:14, Tetsuo Handa wrote:
> On 2019/05/16 20:48, Jan Kara wrote:
> > OK, so non-racy fix was a bit more involved and I've ended up just
> > upgrading the file reference to an exclusive one in loop_set_fd() instead
> > of trying to hand-craft some locking solution. The result is attached and
> > it passes blktests.
> 
> blkdev_get() has corresponding blkdev_put().
> bdgrab() does not have corresponding bdput() ?

Yes, and that's hidden inside blkdev_put() (or failing blkdev_get()). Don't
get me started on calling conventions of these functions... I've wasted half
an hour trying to figure out where I'm leaking inode references in my patch
;).

Honza

-- 
Jan Kara 
SUSE Labs, CR


Re: INFO: task hung in __get_super

2019-05-15 Thread syzbot

Hello,

syzbot has tested the proposed patch and the reproducer did not trigger a crash:


Reported-and-tested-by: syzbot+10007d66ca02b08f0...@syzkaller.appspotmail.com


Tested on:

commit: e93c9c99 Linux 5.1
git tree:   git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git v5.1

kernel config:  https://syzkaller.appspot.com/x/.config?x=5edd1df52e9bc982
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
patch:  https://syzkaller.appspot.com/x/patch.diff?x=133626d8a0

Note: testing is done by a robot and is best-effort only.


Re: INFO: task hung in __get_super

2019-05-15 Thread Tetsuo Handa
On 2019/05/15 19:21, Jan Kara wrote:
> The question is how to fix this problem. The simplest fix I can see is that
> we'd just refuse to do LOOP_SET_FD if someone has the block device
> exclusively open as there are high chances such user will be unpleasantly
> surprised by the device changing under him. OTOH this has some potential
> for userspace visible regressions. But I guess it's worth a try. Something
> like attached patch?

(1) If I understand correctly, FMODE_EXCL is set at blkdev_open() only if O_EXCL
is specified. How can we detect the exclusive-open case when O_EXCL was not used?
The reproducer ( https://syzkaller.appspot.com/text?tag=ReproC=135385a8a0 ) does
not use O_EXCL.

(2) There seems to be no serialization. What guarantees that mount_bdev() does
not start (e.g. because this thread was preempted) after the check added by this
patch has succeeded?


Re: INFO: task hung in __get_super

2019-05-14 Thread Tetsuo Handa
Since lo_ioctl() is calling sb_set_blocksize() immediately after udf_load_vrs()
called sb_set_blocksize(), udf_tread() cannot use the expected i_blkbits
settings...

[   48.685672][ T7322] fs/block_dev.c:135 bdev=14fa0ec2 12 -> 9
[   48.694675][ T7322] CPU: 4 PID: 7322 Comm: a.out Not tainted 5.1.0+ #196
[   48.701321][ T7322] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   48.710265][ T7322] Call Trace:
[   48.710272][ T7322]  dump_stack+0xaa/0xd8
[   48.715633][ T7322]  set_blocksize+0xff/0x140
[   48.822094][ T7322]  sb_set_blocksize+0x27/0x70
[   48.824843][ T7322]  udf_load_vrs+0x4b/0x500
[   48.827322][ T7322]  udf_fill_super+0x32e/0x890
[   48.830125][ T7322]  ? snprintf+0x66/0x90
[   48.832572][ T7322]  mount_bdev+0x1c7/0x210
[   48.835293][ T7322]  ? udf_load_vrs+0x500/0x500
[   48.838009][ T7322]  udf_mount+0x34/0x40
[   48.840504][ T7322]  legacy_get_tree+0x2d/0x80
[   48.843192][ T7322]  vfs_get_tree+0x30/0x140
[   48.845787][ T7322]  do_mount+0x830/0xc30
[   48.848325][ T7322]  ? copy_mount_options+0x152/0x1c0
[   48.851066][ T7322]  ksys_mount+0xab/0x120
[   48.853627][ T7322]  __x64_sys_mount+0x26/0x30
[   48.856168][ T7322]  do_syscall_64+0x7c/0x1a0
[   48.858943][ T7322]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   48.978376][ T7332] fs/block_dev.c:135 bdev=14fa0ec2 9 -> 12
[   49.079394][ T7332] CPU: 6 PID: 7332 Comm: a.out Not tainted 5.1.0+ #196
[   49.082769][ T7332] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[   49.089007][ T7332] Call Trace:
[   49.091410][ T7332]  dump_stack+0xaa/0xd8
[   49.094053][ T7332]  set_blocksize+0xff/0x140
[   49.096734][ T7332]  lo_ioctl+0x570/0xc60
[   49.099366][ T7332]  ? loop_queue_work+0xdb0/0xdb0
[   49.102079][ T7332]  blkdev_ioctl+0xb69/0xc10
[   49.104667][ T7332]  block_ioctl+0x56/0x70
[   49.107267][ T7332]  ? blkdev_fallocate+0x230/0x230
[   49.110035][ T7332]  do_vfs_ioctl+0xc1/0x7e0
[   49.112728][ T7332]  ? tomoyo_file_ioctl+0x23/0x30
[   49.115452][ T7332]  ksys_ioctl+0x94/0xb0
[   49.118008][ T7332]  __x64_sys_ioctl+0x1e/0x30
[   49.120686][ T7332]  do_syscall_64+0x7c/0x1a0
[   49.123470][ T7332]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

[   49.163269][ T7322] UDF-fs: error (device loop2): udf_read_tagged: read failed, block=256, location=256
[   51.009494][ T7322] grow_dev_page: 5 callbacks suppressed
[   51.009509][ T7322] fs/buffer.c:960 bdev=14fa0ec2 block=32 inode->i_blkbits=12 index=4 size=512
[   51.016583][ T7322] fs/buffer.c:967 bdev=14fa0ec2 block=32 index=4 size=512 bh->b_size=512

[   55.046361][ T7322] grow_dev_page: 1177059 callbacks suppressed
[   55.046366][ T7322] fs/buffer.c:960 bdev=14fa0ec2 block=32 inode->i_blkbits=12 index=4 size=512
[   55.054903][ T7322] grow_dev_page: 1177037 callbacks suppressed
[   55.054906][ T7322] fs/buffer.c:967 bdev=14fa0ec2 block=32 index=4 size=512 bh->b_size=512
[   56.055187][ T7322] grow_dev_page: 1279065 callbacks suppressed
[   56.055191][ T7322] fs/buffer.c:960 bdev=14fa0ec2 block=32 inode->i_blkbits=12 index=4 size=512
[   56.064443][ T7322] grow_dev_page: 1279065 callbacks suppressed
[   56.064446][ T7322] fs/buffer.c:967 bdev=14fa0ec2 block=32 index=4 size=512 bh->b_size=512

[   92.282798][ T1097] INFO: task a.out:7322 can't die for more than 30 seconds.
[   92.285926][ T1097] a.out   R  running task0  7322  1 0x80004004
[   92.289143][ T1097] Call Trace:
[   92.291175][ T1097]  __schedule+0x2eb/0x600
[   92.293354][ T1097]  ? apic_timer_interrupt+0xa/0x20
[   92.295802][ T1097]  ? xas_load+0x13/0xd0
[   92.298079][ T1097]  ? __switch_to_asm+0x40/0x70
[   92.300466][ T1097]  ? __switch_to_asm+0x34/0x70
[   92.302890][ T1097]  ? __switch_to_asm+0x40/0x70
[   92.305299][ T1097]  ? __switch_to_asm+0x34/0x70
[   92.307621][ T1097]  ? __switch_to_asm+0x40/0x70
[   92.309908][ T1097]  ? __switch_to_asm+0x34/0x70
[   92.312161][ T1097]  ? debug_smp_processor_id+0x2b/0x130
[   92.314532][ T1097]  ? xas_start+0xa7/0x110
[   92.317139][ T1097]  ? xas_load+0x3d/0xd0
[   92.319674][ T1097]  ? xas_start+0xa7/0x110
[   92.321746][ T1097]  ? xas_load+0x3d/0xd0
[   92.323831][ T1097]  ? find_get_entry+0x172/0x220
[   92.326115][ T1097]  ? __this_cpu_preempt_check+0x37/0x120
[   92.328714][ T1097]  ? __find_get_block+0x485/0x610
[   92.331096][ T1097]  ? _raw_spin_trylock+0x12/0x50
[   92.333402][ T1097]  ? __getblk_gfp+0xd9/0x4a0
[   92.335617][ T1097]  ? __bread_gfp+0x2d/0x130
[   92.338410][ T1097]  ? udf_tread+0x48/0x70
[   92.341144][ T1097]  ? udf_read_tagged+0x40/0x1a0
[   92.343537][ T1097]  ? udf_check_anchor_block+0x94/0x190
[   92.346054][ T1097]  ? udf_scan_anchors+0x12b/0x250
[   92.348480][ T1097]  ? udf_load_vrs+0x2b7/0x500
[   92.350732][ T1097]  ? udf_fill_super+0x32e/0x890
[   92.353172][ T1097]  ? snprintf+0x66/0x90
[   92.355256][ T1097]  ? 

Re: INFO: task hung in __get_super

2019-05-13 Thread Tetsuo Handa
Hello, Jan.

syzbot is still reporting livelocks inside __getblk_gfp() [1] (similar to
commit 04906b2f542c2362 ("blockdev: Fix livelocks on loop device")).

  [1] 
https://syzkaller.appspot.com/bug?id=835a0b9e75b14b55112661cbc61ca8b8f0edf767

A debug printk() patch shown below revealed that since bdev->bd_inode->i_blkbits
is unexpectedly modified, __find_get_block_slow() is failing to find buffer pages
created by grow_dev_page(). I guess that opening a loop device (while somebody is
doing a mount operation on that loop device) can trigger set_init_blocksize()
from __blkdev_get().


diff --git a/fs/buffer.c b/fs/buffer.c
index 0faa41fb4c88..8e197c771476 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -190,7 +190,7 @@ EXPORT_SYMBOL(end_buffer_write_sync);
  * succeeds, there is no need to take private_lock.
  */
 static struct buffer_head *
-__find_get_block_slow(struct block_device *bdev, sector_t block)
+__find_get_block_slow(struct block_device *bdev, sector_t block, unsigned size)
 {
 	struct inode *bd_inode = bdev->bd_inode;
 	struct address_space *bd_mapping = bd_inode->i_mapping;
@@ -204,12 +204,20 @@ __find_get_block_slow(struct block_device *bdev, sector_t block)
 
 	index = block >> (PAGE_SHIFT - bd_inode->i_blkbits);
 	page = find_get_page_flags(bd_mapping, index, FGP_ACCESSED);
-	if (!page)
+	if (!page) {
+		if (fatal_signal_pending(current) && __ratelimit(&last_warned))
+			printk("%s:%u block=%llu bd_inode->i_blkbits=%u index=%lu size=%u\n",
+			       __FILE__, __LINE__, block, bd_inode->i_blkbits, index, size);
 		goto out;
+	}
 
 	spin_lock(&bd_mapping->private_lock);
-	if (!page_has_buffers(page))
+	if (!page_has_buffers(page)) {
+		if (fatal_signal_pending(current) && __ratelimit(&last_warned))
+			printk("%s:%u block=%llu bd_inode->i_blkbits=%u index=%lu size=%u page=%p\n",
+			       __FILE__, __LINE__, block, bd_inode->i_blkbits, index, size, page);
 		goto out_unlock;
+	}
 	head = page_buffers(page);
 	bh = head;
 	do {
@@ -934,6 +942,8 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 	sector_t end_block;
 	int ret = 0;		/* Will call free_more_memory() */
 	gfp_t gfp_mask;
+	static DEFINE_RATELIMIT_STATE(last_warned1, HZ, 1);
+	static DEFINE_RATELIMIT_STATE(last_warned2, HZ, 1);
 
 	gfp_mask = mapping_gfp_constraint(inode->i_mapping, ~__GFP_FS) | gfp;
 
@@ -946,11 +956,15 @@ grow_dev_page(struct block_device *bdev, sector_t block,
 		gfp_mask |= __GFP_NOFAIL;
 
 	page = find_or_create_page(inode->i_mapping, index, gfp_mask);
+	if (fatal_signal_pending(current) && __ratelimit(&last_warned1))
+		printk("%s:%u block=%llu inode->i_blkbits=%u index=%lu size=%u\n", __FILE__, __LINE__, block, inode->i_blkbits, index, size);
 
 	BUG_ON(!PageLocked(page));
 
 	if (page_has_buffers(page)) {
 		bh = page_buffers(page);
+		if (fatal_signal_pending(current) && __ratelimit(&last_warned2))
+			printk("%s:%u block=%llu index=%lu size=%u bh->b_size=%lu\n", __FILE__, __LINE__, block, index, size, bh->b_size);
 		if (bh->b_size == size) {
 			end_block = init_page_buffers(page, bdev,
 				(sector_t)index << sizebits,
@@ -1292,7 +1306,7 @@ __find_get_block(struct block_device *bdev, sector_t block, unsigned size)
 
 	if (bh == NULL) {
 		/* __find_get_block_slow will mark the page accessed */
-		bh = __find_get_block_slow(bdev, block);
+		bh = __find_get_block_slow(bdev, block, size);
 		if (bh)
 			bh_lru_install(bh);
 	} else


[   89.805981][ T7388] __find_get_block_slow: 1278192 callbacks suppressed
[   89.805987][ T7388] fs/buffer.c:218 block=1 bd_inode->i_blkbits=12 index=1 size=512 page=cd89a24d
[   90.716164][ T7388] grow_dev_page: 1295496 callbacks suppressed
[   90.716167][ T7388] fs/buffer.c:960 block=1 inode->i_blkbits=12 index=0 size=512
[   90.735989][ T7388] grow_dev_page: 1296140 callbacks suppressed
[   90.735991][ T7388] fs/buffer.c:967 block=1 index=0 size=512 bh->b_size=512
[   90.815859][ T7388] __find_get_block_slow: 1312165 callbacks suppressed
[   90.815862][ T7388] fs/buffer.c:218 block=1 bd_inode->i_blkbits=12 index=1 size=512 page=cd89a24d
[   91.726006][ T7388] grow_dev_page: 1306138 callbacks suppressed
[   91.726022][ T7388] fs/buffer.c:960 block=1 inode->i_blkbits=12 index=0 size=512
[   91.745856][ T7388] grow_dev_page: 1305423 callbacks suppressed
[   91.745859][ T7388] fs/buffer.c:967 block=1 index=0 size=512 bh->b_size=512
[   

Re: INFO: task hung in __get_super

2019-05-03 Thread Tetsuo Handa
On 2019/05/01 0:34, Tetsuo Handa wrote:
> I still cannot understand what the problem is.
(...snipped...)
> I guessed that something went wrong with 8083 inside __getblk_gfp().
> How can loop_ctl_mutex be relevant to this problem?
> 

syzbot got similar NMI backtrace. No loop_ctl_mutex is involved.

INFO: task hung in mount_bdev (2)
https://syzkaller.appspot.com/bug?id=d9b9fa1428ff2466de64fc85256e769f516cea58


Re: INFO: task hung in __get_super

2019-04-30 Thread Tetsuo Handa
On 2019/05/01 0:07, Jan Kara wrote:
> Ah, right. I've missed that in your patch. So your patch should be really
> fixing the problem. Will you post it officially? Thanks!

I still cannot understand what the problem is.
According to console output,

--
INFO: task syz-executor274:8097 blocked for more than 143 seconds. 
INFO: task blkid:8099 blocked for more than 143 seconds. 

1 lock held by syz-executor274/8083:
2 locks held by syz-executor274/8097:
 #0: 7a5ed526 (&bdev->bd_mutex){+.+.}, at: blkdev_reread_part+0x1f/0x40 block/ioctl.c:192
 #1: 67606e21 (&type->s_umount_key#39){.+.+}, at: __get_super.part.0+0x203/0x2e0 fs/super.c:788
1 lock held by blkid/8099:
 #0: 7a5ed526 (&bdev->bd_mutex){+.+.}, at: blkdev_put+0x34/0x560 fs/block_dev.c:1866
--

8099 was blocked for too long waiting for 7a5ed526 held by 8097.
8097 was blocked for too long waiting for 67606e21 held by somebody.
Since there is nobody else holding 67606e21,
I guessed that the "somebody" which is holding 67606e21 is 8083.

--
[ 1107.337625][C1] CPU: 1 PID: 8083 Comm: syz-executor274 Not tainted 5.1.0-rc6+ #89
[ 1107.337631][C1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 1107.337636][C1] RIP: 0010:debug_lockdep_rcu_enabled.part.0+0xb/0x60
[ 1107.337648][C1] Code: 5b 5d c3 e8 67 71 e5 ff 0f 1f 80 00 00 00 00 55 48 
89 e5 e8 37 ff ff ff 5d c3 0f 1f 44 00 00 48 b8 00 00 00 00 00 fc ff df 55 <48> 
89 e5 53 65 48 8b 1c 25 00 ee 01 00 48 8d bb 7c 08 00 00 48 89
[ 1107.337652][C1] RSP: 0018:8880a85274c8 EFLAGS: 0202
[ 1107.337661][C1] RAX: dc00 RBX: 8880a85275d8 RCX: 
112bcd63
[ 1107.337666][C1] RDX:  RSI: 870d8f3c RDI: 
8880a85275e0
[ 1107.337672][C1] RBP: 8880a85274d8 R08: 888081e68540 R09: 
ed1015d25bc8
[ 1107.337677][C1] R10: ed1015d25bc7 R11: 8880ae92de3b R12: 

[ 1107.337683][C1] R13: 8880a694d640 R14: 88809541b942 R15: 
0006
[ 1107.337689][C1] FS:  00e0b880() GS:8880ae90() 
knlGS:
[ 1107.337693][C1] CS:  0010 DS:  ES:  CR0: 80050033
[ 1107.337699][C1] CR2: ff600400 CR3: 92d6f000 CR4: 
001406e0
[ 1107.337704][C1] DR0:  DR1:  DR2: 

[ 1107.337710][C1] DR3:  DR6: fffe0ff0 DR7: 
0400
[ 1107.337713][C1] Call Trace:
[ 1107.337717][C1]  ? debug_lockdep_rcu_enabled+0x71/0xa0
[ 1107.337721][C1]  xas_descend+0xbf/0x370
[ 1107.337724][C1]  xas_load+0xef/0x150
[ 1107.337728][C1]  find_get_entry+0x13d/0x880
[ 1107.337733][C1]  ? find_get_entries_tag+0xc10/0xc10
[ 1107.337736][C1]  ? mark_held_locks+0xa4/0xf0
[ 1107.337741][C1]  ? pagecache_get_page+0x1a8/0x740
[ 1107.337745][C1]  pagecache_get_page+0x4c/0x740
[ 1107.337749][C1]  __getblk_gfp+0x27e/0x970
[ 1107.337752][C1]  __bread_gfp+0x2f/0x300
[ 1107.337756][C1]  udf_tread+0xf1/0x140
[ 1107.337760][C1]  udf_read_tagged+0x50/0x530
[ 1107.337764][C1]  udf_check_anchor_block+0x1ef/0x680
[ 1107.337768][C1]  ? blkpg_ioctl+0xa90/0xa90
[ 1107.337772][C1]  ? udf_process_sequence+0x35d0/0x35d0
[ 1107.337776][C1]  ? submit_bio+0xba/0x480
[ 1107.337780][C1]  udf_scan_anchors+0x3f4/0x680
[ 1107.337784][C1]  ? udf_check_anchor_block+0x680/0x680
[ 1107.337789][C1]  ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
[ 1107.337793][C1]  ? udf_get_last_session+0x120/0x120
[ 1107.337797][C1]  udf_load_vrs+0x67f/0xc80
[ 1107.337801][C1]  ? udf_scan_anchors+0x680/0x680
[ 1107.337805][C1]  ? udf_bread+0x260/0x260
[ 1107.337809][C1]  ? lockdep_init_map+0x1be/0x6d0
[ 1107.337813][C1]  udf_fill_super+0x7d8/0x16d1
[ 1107.337817][C1]  ? udf_load_vrs+0xc80/0xc80
[ 1107.337820][C1]  ? vsprintf+0x40/0x40
[ 1107.337824][C1]  ? set_blocksize+0x2bf/0x340
[ 1107.337829][C1]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
[ 1107.337833][C1]  mount_bdev+0x307/0x3c0
[ 1107.337837][C1]  ? udf_load_vrs+0xc80/0xc80
[ 1107.337840][C1]  udf_mount+0x35/0x40
[ 1107.337844][C1]  ? udf_get_pblock_meta25+0x3a0/0x3a0
[ 1107.337848][C1]  legacy_get_tree+0xf2/0x200
[ 1107.337853][C1]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
[ 1107.337857][C1]  vfs_get_tree+0x123/0x450
[ 1107.337860][C1]  do_mount+0x1436/0x2c40
[ 1107.337864][C1]  ? copy_mount_string+0x40/0x40
[ 1107.337868][C1]  ? _copy_from_user+0xdd/0x150
[ 1107.337873][C1]  ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
[ 1107.337877][C1]  ? copy_mount_options+0x280/0x3a0
[ 1107.337881][C1]  ksys_mount+0xdb/0x150
[ 1107.337885][C1]  __x64_sys_mount+0xbe/0x150
[ 1107.337889][C1]  do_syscall_64+0x103/0x610
[ 1107.337893][C1]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
--


Re: INFO: task hung in __get_super

2019-04-30 Thread Jan Kara
On Tue 30-04-19 14:18:21, Al Viro wrote:
> On Tue, Apr 30, 2019 at 03:07:39PM +0200, Jan Kara wrote:
> > On Tue 30-04-19 04:11:44, Al Viro wrote:
> > > On Tue, Apr 30, 2019 at 04:55:01AM +0200, Jan Kara wrote:
> > > 
> > > > Yeah, you're right. And if we push the patch a bit further to not take
> > > > loop_ctl_mutex for invalid ioctl number, that would fix the problem. I
> > > > can send a fix.
> > > 
> > > Huh?  We don't take it until in lo_simple_ioctl(), and that patch doesn't
> > > get to its call on invalid ioctl numbers.  What am I missing here?
> > 
> > Doesn't it? blkdev_ioctl() calls into __blkdev_driver_ioctl() for
> > unrecognized ioctl numbers. That calls into lo_ioctl() in case of a loop
> > device. lo_ioctl() calls into lo_simple_ioctl() for ioctl numbers it
> > doesn't recognize and lo_simple_ioctl() will lock loop_ctl_mutex as you
> > say.
> 
> Not with the patch upthread.  lo_ioctl() part was
> 
> @@ -1567,10 +1564,9 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
>   case LOOP_SET_BLOCK_SIZE:
>   if (!(mode & FMODE_WRITE) && !capable(CAP_SYS_ADMIN))
>   return -EPERM;
> - /* Fall through */
> + return lo_simple_ioctl(lo, cmd, arg);
>   default:
> - err = lo_simple_ioctl(lo, cmd, arg);
> - break;
> + return -EINVAL;
>   }
>  
>   return err;
> 
> so anything unrecognized doesn't make it to lo_simple_ioctl() at all.

Ah, right. I've missed that in your patch. So your patch should be really
fixing the problem. Will you post it officially? Thanks!

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: INFO: task hung in __get_super

2019-04-30 Thread Al Viro
On Tue, Apr 30, 2019 at 03:07:39PM +0200, Jan Kara wrote:
> On Tue 30-04-19 04:11:44, Al Viro wrote:
> > On Tue, Apr 30, 2019 at 04:55:01AM +0200, Jan Kara wrote:
> > 
> > > Yeah, you're right. And if we push the patch a bit further to not take
> > > loop_ctl_mutex for invalid ioctl number, that would fix the problem. I
> > > can send a fix.
> > 
> > Huh?  We don't take it until in lo_simple_ioctl(), and that patch doesn't
> > get to its call on invalid ioctl numbers.  What am I missing here?
> 
> Doesn't it? blkdev_ioctl() calls into __blkdev_driver_ioctl() for
> unrecognized ioctl numbers. That calls into lo_ioctl() in case of a loop
> device. lo_ioctl() calls into lo_simple_ioctl() for ioctl numbers it
> doesn't recognize and lo_simple_ioctl() will lock loop_ctl_mutex as you
> say.

Not with the patch upthread.  lo_ioctl() part was

@@ -1567,10 +1564,9 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
case LOOP_SET_BLOCK_SIZE:
if (!(mode & FMODE_WRITE) && !capable(CAP_SYS_ADMIN))
return -EPERM;
-   /* Fall through */
+   return lo_simple_ioctl(lo, cmd, arg);
default:
-   err = lo_simple_ioctl(lo, cmd, arg);
-   break;
+   return -EINVAL;
}
 
return err;

so anything unrecognized doesn't make it to lo_simple_ioctl() at all.


Re: INFO: task hung in __get_super

2019-04-30 Thread Jan Kara
On Tue 30-04-19 04:11:44, Al Viro wrote:
> On Tue, Apr 30, 2019 at 04:55:01AM +0200, Jan Kara wrote:
> 
> > Yeah, you're right. And if we push the patch a bit further to not take
> > loop_ctl_mutex for invalid ioctl number, that would fix the problem. I
> > can send a fix.
> 
> Huh?  We don't take it until in lo_simple_ioctl(), and that patch doesn't
> get to its call on invalid ioctl numbers.  What am I missing here?

Doesn't it? blkdev_ioctl() calls into __blkdev_driver_ioctl() for
unrecognized ioctl numbers. That calls into lo_ioctl() in case of a loop
device. lo_ioctl() calls into lo_simple_ioctl() for ioctl numbers it
doesn't recognize and lo_simple_ioctl() will lock loop_ctl_mutex as you
say.

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: INFO: task hung in __get_super

2019-04-29 Thread Al Viro
On Tue, Apr 30, 2019 at 04:55:01AM +0200, Jan Kara wrote:

> Yeah, you're right. And if we push the patch a bit further to not take
> loop_ctl_mutex for invalid ioctl number, that would fix the problem. I
> can send a fix.

Huh?  We don't take it until in lo_simple_ioctl(), and that patch doesn't
get to its call on invalid ioctl numbers.  What am I missing here?


Re: INFO: task hung in __get_super

2019-04-29 Thread Jan Kara
On Sun 28-04-19 19:51:09, Al Viro wrote:
> On Sun, Apr 28, 2019 at 11:14:06AM -0700, syzbot wrote:
> >  down_read+0x49/0x90 kernel/locking/rwsem.c:26
> >  __get_super.part.0+0x203/0x2e0 fs/super.c:788
> >  __get_super include/linux/spinlock.h:329 [inline]
> >  get_super+0x2e/0x50 fs/super.c:817
> >  fsync_bdev+0x19/0xd0 fs/block_dev.c:525
> >  invalidate_partition+0x36/0x60 block/genhd.c:1581
> >  drop_partitions block/partition-generic.c:443 [inline]
> >  rescan_partitions+0xef/0xa20 block/partition-generic.c:516
> >  __blkdev_reread_part+0x1a2/0x230 block/ioctl.c:173
> >  blkdev_reread_part+0x27/0x40 block/ioctl.c:193
> >  loop_reread_partitions+0x1c/0x40 drivers/block/loop.c:633
> >  loop_set_status+0xe57/0x1380 drivers/block/loop.c:1296
> >  loop_set_status64+0xc2/0x120 drivers/block/loop.c:1416
> >  lo_ioctl+0x8fc/0x2150 drivers/block/loop.c:1559
> >  __blkdev_driver_ioctl block/ioctl.c:303 [inline]
> >  blkdev_ioctl+0x6f2/0x1d10 block/ioctl.c:605
> >  block_ioctl+0xee/0x130 fs/block_dev.c:1933
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
> >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> ioctl(..., BLKRRPART) blocked on ->s_umount in __get_super().
> The trouble is, the only things holding ->s_umount appears to be
> these:
> 
> > 2 locks held by syz-executor274/11716:
> >  #0: a19e2025 (&type->s_umount_key#38/1){+.+.}, at: alloc_super+0x158/0x890 fs/super.c:228
> >  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl drivers/block/loop.c:1514 [inline]
> >  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150 drivers/block/loop.c:1572
> 
> > 2 locks held by syz-executor274/11717:
> >  #0: e185c083 (&type->s_umount_key#38/1){+.+.}, at: alloc_super+0x158/0x890 fs/super.c:228
> >  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl drivers/block/loop.c:1514 [inline]
> >  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150 drivers/block/loop.c:1572
> 
> ... and that's bollocks.  ->s_umount held there is that on freshly allocated
> superblock.  It *MUST* be in mount(2); no other syscall should be able to
> call alloc_super() in the first place.  So what the hell is that doing
> trying to call lo_ioctl() inside mount(2)?  Something like isofs attempting
> cdrom ioctls on the underlying device?

Actually UDF also calls the CDROMMULTISESSION ioctl during mount. So I could
see how we get to lo_simple_ioctl(), and indeed that would acquire
loop_ctl_mutex under s_umount, which is the opposite lock order to the one
taken in the BLKRRPART ioctl path.

> Why do we have loop_func_table->ioctl(), BTW?  All in-tree instances are
> either NULL or return -EINVAL unconditionally.  Considering that the
> caller is
> err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
> we could bloody well just get rid of cryptoloop_ioctl() (the only
> non-NULL instance) and get rid of calling lo_simple_ioctl() in
> lo_ioctl() switch's default.

Yeah, you're right. And if we push the patch a bit further to not take
loop_ctl_mutex for invalid ioctl number, that would fix the problem. I
can send a fix.

Honza

> 
> Something like this:
> 
> diff --git a/drivers/block/cryptoloop.c b/drivers/block/cryptoloop.c
> index 254ee7d54e91..f16468a562f5 100644
> --- a/drivers/block/cryptoloop.c
> +++ b/drivers/block/cryptoloop.c
> @@ -167,12 +167,6 @@ cryptoloop_transfer(struct loop_device *lo, int cmd,
>  }
>  
>  static int
> -cryptoloop_ioctl(struct loop_device *lo, int cmd, unsigned long arg)
> -{
> - return -EINVAL;
> -}
> -
> -static int
>  cryptoloop_release(struct loop_device *lo)
>  {
>   struct crypto_sync_skcipher *tfm = lo->key_data;
> @@ -188,7 +182,6 @@ cryptoloop_release(struct loop_device *lo)
>  static struct loop_func_table cryptoloop_funcs = {
>   .number = LO_CRYPT_CRYPTOAPI,
>   .init = cryptoloop_init,
> - .ioctl = cryptoloop_ioctl,
>   .transfer = cryptoloop_transfer,
>   .release = cryptoloop_release,
>   .owner = THIS_MODULE
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index bf1c61cab8eb..2ec162b80562 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -955,7 +955,6 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
>   lo->lo_flags = lo_flags;
>   lo->lo_backing_file = file;
>   lo->transfer = NULL;
> - lo->ioctl = NULL;
>   lo->lo_sizelimit = 0;
>   lo->old_gfp_mask = mapping_gfp_mask(mapping);
>   mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
> @@ -1064,7 +1063,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
>  
>   

Re: INFO: task hung in __get_super

2019-04-28 Thread Dmitry Vyukov
On Sun, Apr 28, 2019 at 8:51 PM Al Viro  wrote:
>
> On Sun, Apr 28, 2019 at 11:14:06AM -0700, syzbot wrote:
> >  down_read+0x49/0x90 kernel/locking/rwsem.c:26
> >  __get_super.part.0+0x203/0x2e0 fs/super.c:788
> >  __get_super include/linux/spinlock.h:329 [inline]
> >  get_super+0x2e/0x50 fs/super.c:817
> >  fsync_bdev+0x19/0xd0 fs/block_dev.c:525
> >  invalidate_partition+0x36/0x60 block/genhd.c:1581
> >  drop_partitions block/partition-generic.c:443 [inline]
> >  rescan_partitions+0xef/0xa20 block/partition-generic.c:516
> >  __blkdev_reread_part+0x1a2/0x230 block/ioctl.c:173
> >  blkdev_reread_part+0x27/0x40 block/ioctl.c:193
> >  loop_reread_partitions+0x1c/0x40 drivers/block/loop.c:633
> >  loop_set_status+0xe57/0x1380 drivers/block/loop.c:1296
> >  loop_set_status64+0xc2/0x120 drivers/block/loop.c:1416
> >  lo_ioctl+0x8fc/0x2150 drivers/block/loop.c:1559
> >  __blkdev_driver_ioctl block/ioctl.c:303 [inline]
> >  blkdev_ioctl+0x6f2/0x1d10 block/ioctl.c:605
> >  block_ioctl+0xee/0x130 fs/block_dev.c:1933
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
> >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> ioctl(..., BLKRRPART) blocked on ->s_umount in __get_super().
> The trouble is, the only things holding ->s_umount appears to be
> these:
>
> > 2 locks held by syz-executor274/11716:
> >  #0: a19e2025 (&type->s_umount_key#38/1){+.+.}, at: alloc_super+0x158/0x890 fs/super.c:228
> >  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl drivers/block/loop.c:1514 [inline]
> >  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150 drivers/block/loop.c:1572
>
> > 2 locks held by syz-executor274/11717:
> >  #0: e185c083 (&type->s_umount_key#38/1){+.+.}, at: alloc_super+0x158/0x890 fs/super.c:228
> >  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl drivers/block/loop.c:1514 [inline]
> >  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150 drivers/block/loop.c:1572
>
> ... and that's bollocks.  ->s_umount held there is that on freshly allocated
> superblock.  It *MUST* be in mount(2); no other syscall should be able to
> call alloc_super() in the first place.  So what the hell is that doing
> trying to call lo_ioctl() inside mount(2)?  Something like isofs attempting
> cdrom ioctls on the underlying device?


How useful would it be to see full stacks in such lockdep reports?
Now that we have lib/stackdepot.c, which is capable of memorizing a large
number of stacks and converting each of them to a single u32, we could use it
in more debug facilities. I remember +Peter mentioned some problems with
interrupts/reentrancy of stackdepot, but I hope that's resolvable (at least
in some conservative way, as we already call stackdepot from interrupts).
I think the ODEBUG facility has the same problem of showing only a single
PC in reports for a past stack.
Should I file an issue for this?


> Why do we have loop_func_table->ioctl(), BTW?  All in-tree instances are
> either NULL or return -EINVAL unconditionally.  Considering that the
> caller is
> err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
> we could bloody well just get rid of cryptoloop_ioctl() (the only
> non-NULL instance) and get rid of calling lo_simple_ioctl() in
> lo_ioctl() switch's default.
>
> Something like this:
>
> diff --git a/drivers/block/cryptoloop.c b/drivers/block/cryptoloop.c
> index 254ee7d54e91..f16468a562f5 100644
> --- a/drivers/block/cryptoloop.c
> +++ b/drivers/block/cryptoloop.c
> @@ -167,12 +167,6 @@ cryptoloop_transfer(struct loop_device *lo, int cmd,
>  }
>
>  static int
> -cryptoloop_ioctl(struct loop_device *lo, int cmd, unsigned long arg)
> -{
> -   return -EINVAL;
> -}
> -
> -static int
>  cryptoloop_release(struct loop_device *lo)
>  {
> struct crypto_sync_skcipher *tfm = lo->key_data;
> @@ -188,7 +182,6 @@ cryptoloop_release(struct loop_device *lo)
>  static struct loop_func_table cryptoloop_funcs = {
> .number = LO_CRYPT_CRYPTOAPI,
> .init = cryptoloop_init,
> -   .ioctl = cryptoloop_ioctl,
> .transfer = cryptoloop_transfer,
> .release = cryptoloop_release,
> .owner = THIS_MODULE
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index bf1c61cab8eb..2ec162b80562 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -955,7 +955,6 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> lo->lo_flags = lo_flags;
> lo->lo_backing_file = file;
> lo->transfer = NULL;
> -   lo->ioctl = NULL;
> lo->lo_sizelimit = 0;
> lo->old_gfp_mask = mapping_gfp_mask(mapping);
> mapping_set_gfp_mask(mapping, 

Re: INFO: task hung in __get_super

2019-04-28 Thread Tetsuo Handa
On 2019/04/29 3:51, Al Viro wrote:
> ioctl(..., BLKRRPART) blocked on ->s_umount in __get_super().
> The trouble is, the only things holding ->s_umount appears to be
> these:

Not always true. lockdep_print_held_locks() from debug_show_all_locks() cannot
report locks held by TASK_RUNNING threads. Due to enabling CONFIG_PRINTK_CALLER=y,
the output from trigger_all_cpu_backtrace() is no longer included in the report
file (i.e. the report is prematurely truncated) because the NMI backtrace is
printed from a different printk() context. If we check the console output, we
can understand that

>> 1 lock held by syz-executor274/8083:

was doing mount(2). Since there is a possibility that that thread had been
looping for many seconds, long enough to trigger the khungtaskd warning, we
can't tell whether this is a locking dependency problem.


[ 1107.252933][ T1041] NMI backtrace for cpu 0
[ 1107.257402][ T1041] CPU: 0 PID: 1041 Comm: khungtaskd Not tainted 5.1.0-rc6+ #89
[ 1107.264960][ T1041] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 1107.275380][ T1041] Call Trace:
[ 1107.278691][ T1041]  dump_stack+0x172/0x1f0
[ 1107.283216][ T1041]  nmi_cpu_backtrace.cold+0x63/0xa4
[ 1107.288469][ T1041]  ? lapic_can_unplug_cpu.cold+0x38/0x38
[ 1107.294155][ T1041]  nmi_trigger_cpumask_backtrace+0x1be/0x236
[ 1107.300256][ T1041]  arch_trigger_cpumask_backtrace+0x14/0x20
[ 1107.306174][ T1041]  watchdog+0x9b7/0xec0
[ 1107.310362][ T1041]  kthread+0x357/0x430
[ 1107.314446][ T1041]  ? reset_hung_task_detector+0x30/0x30
[ 1107.320016][ T1041]  ? kthread_cancel_delayed_work_sync+0x20/0x20
[ 1107.326280][ T1041]  ret_from_fork+0x3a/0x50
[ 1107.331403][ T1041] Sending NMI from CPU 0 to CPUs 1:
[ 1107.337617][C1] NMI backtrace for cpu 1
[ 1107.337625][C1] CPU: 1 PID: 8083 Comm: syz-executor274 Not tainted 5.1.0-rc6+ #89
[ 1107.337631][C1] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
[ 1107.337636][C1] RIP: 0010:debug_lockdep_rcu_enabled.part.0+0xb/0x60
[ 1107.337648][C1] Code: 5b 5d c3 e8 67 71 e5 ff 0f 1f 80 00 00 00 00 55 48 89 e5 e8 37 ff ff ff 5d c3 0f 1f 44 00 00 48 b8 00 00 00 00 00 fc ff df 55 <48> 89 e5 53 65 48 8b 1c 25 00 ee 01 00 48 8d bb 7c 08 00 00 48 89
[ 1107.337652][C1] RSP: 0018:8880a85274c8 EFLAGS: 0202
[ 1107.337661][C1] RAX: dc00 RBX: 8880a85275d8 RCX: 112bcd63
[ 1107.337666][C1] RDX:  RSI: 870d8f3c RDI: 8880a85275e0
[ 1107.337672][C1] RBP: 8880a85274d8 R08: 888081e68540 R09: ed1015d25bc8
[ 1107.337677][C1] R10: ed1015d25bc7 R11: 8880ae92de3b R12: 
[ 1107.337683][C1] R13: 8880a694d640 R14: 88809541b942 R15: 0006
[ 1107.337689][C1] FS:  00e0b880() GS:8880ae90() knlGS:
[ 1107.337693][C1] CS:  0010 DS:  ES:  CR0: 80050033
[ 1107.337699][C1] CR2: ff600400 CR3: 92d6f000 CR4: 001406e0
[ 1107.337704][C1] DR0:  DR1:  DR2: 
[ 1107.337710][C1] DR3:  DR6: fffe0ff0 DR7: 0400
[ 1107.337713][C1] Call Trace:
[ 1107.337717][C1]  ? debug_lockdep_rcu_enabled+0x71/0xa0
[ 1107.337721][C1]  xas_descend+0xbf/0x370
[ 1107.337724][C1]  xas_load+0xef/0x150
[ 1107.337728][C1]  find_get_entry+0x13d/0x880
[ 1107.337733][C1]  ? find_get_entries_tag+0xc10/0xc10
[ 1107.337736][C1]  ? mark_held_locks+0xa4/0xf0
[ 1107.337741][C1]  ? pagecache_get_page+0x1a8/0x740
[ 1107.337745][C1]  pagecache_get_page+0x4c/0x740
[ 1107.337749][C1]  __getblk_gfp+0x27e/0x970
[ 1107.337752][C1]  __bread_gfp+0x2f/0x300
[ 1107.337756][C1]  udf_tread+0xf1/0x140
[ 1107.337760][C1]  udf_read_tagged+0x50/0x530
[ 1107.337764][C1]  udf_check_anchor_block+0x1ef/0x680
[ 1107.337768][C1]  ? blkpg_ioctl+0xa90/0xa90
[ 1107.337772][C1]  ? udf_process_sequence+0x35d0/0x35d0
[ 1107.337776][C1]  ? submit_bio+0xba/0x480
[ 1107.337780][C1]  udf_scan_anchors+0x3f4/0x680
[ 1107.337784][C1]  ? udf_check_anchor_block+0x680/0x680
[ 1107.337789][C1]  ? __sanitizer_cov_trace_const_cmp8+0x18/0x20
[ 1107.337793][C1]  ? udf_get_last_session+0x120/0x120
[ 1107.337797][C1]  udf_load_vrs+0x67f/0xc80
[ 1107.337801][C1]  ? udf_scan_anchors+0x680/0x680
[ 1107.337805][C1]  ? udf_bread+0x260/0x260
[ 1107.337809][C1]  ? lockdep_init_map+0x1be/0x6d0
[ 1107.337813][C1]  udf_fill_super+0x7d8/0x16d1
[ 1107.337817][C1]  ? udf_load_vrs+0xc80/0xc80
[ 1107.337820][C1]  ? vsprintf+0x40/0x40
[ 1107.337824][C1]  ? set_blocksize+0x2bf/0x340
[ 1107.337829][C1]  ? __sanitizer_cov_trace_const_cmp4+0x16/0x20
[ 1107.337833][C1]  mount_bdev+0x307/0x3c0
[ 1107.337837][C1]  ? udf_load_vrs+0xc80/0xc80
[ 1107.337840][C1]  

Re: INFO: task hung in __get_super

2019-04-28 Thread Al Viro
On Sun, Apr 28, 2019 at 11:14:06AM -0700, syzbot wrote:
>  down_read+0x49/0x90 kernel/locking/rwsem.c:26
>  __get_super.part.0+0x203/0x2e0 fs/super.c:788
>  __get_super include/linux/spinlock.h:329 [inline]
>  get_super+0x2e/0x50 fs/super.c:817
>  fsync_bdev+0x19/0xd0 fs/block_dev.c:525
>  invalidate_partition+0x36/0x60 block/genhd.c:1581
>  drop_partitions block/partition-generic.c:443 [inline]
>  rescan_partitions+0xef/0xa20 block/partition-generic.c:516
>  __blkdev_reread_part+0x1a2/0x230 block/ioctl.c:173
>  blkdev_reread_part+0x27/0x40 block/ioctl.c:193
>  loop_reread_partitions+0x1c/0x40 drivers/block/loop.c:633
>  loop_set_status+0xe57/0x1380 drivers/block/loop.c:1296
>  loop_set_status64+0xc2/0x120 drivers/block/loop.c:1416
>  lo_ioctl+0x8fc/0x2150 drivers/block/loop.c:1559
>  __blkdev_driver_ioctl block/ioctl.c:303 [inline]
>  blkdev_ioctl+0x6f2/0x1d10 block/ioctl.c:605
>  block_ioctl+0xee/0x130 fs/block_dev.c:1933
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  file_ioctl fs/ioctl.c:509 [inline]
>  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe

ioctl(..., BLKRRPART) blocked on ->s_umount in __get_super().
The trouble is, the only things holding ->s_umount appears to be
these:

> 2 locks held by syz-executor274/11716:
>  #0: a19e2025 (&type->s_umount_key#38/1){+.+.}, at:
> alloc_super+0x158/0x890 fs/super.c:228
>  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl
> drivers/block/loop.c:1514 [inline]
>  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150
> drivers/block/loop.c:1572

> 2 locks held by syz-executor274/11717:
>  #0: e185c083 (&type->s_umount_key#38/1){+.+.}, at:
> alloc_super+0x158/0x890 fs/super.c:228
>  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_simple_ioctl
> drivers/block/loop.c:1514 [inline]
>  #1: bde6230e (loop_ctl_mutex){+.+.}, at: lo_ioctl+0x266/0x2150
> drivers/block/loop.c:1572

... and that's bollocks.  ->s_umount held there is that on freshly allocated
superblock.  It *MUST* be in mount(2); no other syscall should be able to
call alloc_super() in the first place.  So what the hell is that doing
trying to call lo_ioctl() inside mount(2)?  Something like isofs attempting
cdrom ioctls on the underlying device?

Why do we have loop_func_table->ioctl(), BTW?  All in-tree instances are
either NULL or return -EINVAL unconditionally.  Considering that the
caller is
err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
we could bloody well just get rid of cryptoloop_ioctl() (the only
non-NULL instance) and get rid of calling lo_simple_ioctl() in
lo_ioctl() switch's default.

Something like this:

diff --git a/drivers/block/cryptoloop.c b/drivers/block/cryptoloop.c
index 254ee7d54e91..f16468a562f5 100644
--- a/drivers/block/cryptoloop.c
+++ b/drivers/block/cryptoloop.c
@@ -167,12 +167,6 @@ cryptoloop_transfer(struct loop_device *lo, int cmd,
 }
 
 static int
-cryptoloop_ioctl(struct loop_device *lo, int cmd, unsigned long arg)
-{
-   return -EINVAL;
-}
-
-static int
 cryptoloop_release(struct loop_device *lo)
 {
struct crypto_sync_skcipher *tfm = lo->key_data;
@@ -188,7 +182,6 @@ cryptoloop_release(struct loop_device *lo)
 static struct loop_func_table cryptoloop_funcs = {
.number = LO_CRYPT_CRYPTOAPI,
.init = cryptoloop_init,
-   .ioctl = cryptoloop_ioctl,
.transfer = cryptoloop_transfer,
.release = cryptoloop_release,
.owner = THIS_MODULE
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index bf1c61cab8eb..2ec162b80562 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -955,7 +955,6 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
lo->lo_flags = lo_flags;
lo->lo_backing_file = file;
lo->transfer = NULL;
-   lo->ioctl = NULL;
lo->lo_sizelimit = 0;
lo->old_gfp_mask = mapping_gfp_mask(mapping);
mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
@@ -1064,7 +1063,6 @@ static int __loop_clr_fd(struct loop_device *lo, bool release)
 
loop_release_xfer(lo);
lo->transfer = NULL;
-   lo->ioctl = NULL;
lo->lo_device = NULL;
lo->lo_encryption = NULL;
lo->lo_offset = 0;
@@ -1262,7 +1260,6 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
if (!xfer)
xfer = _funcs;
lo->transfer = xfer->transfer;
-   lo->ioctl = xfer->ioctl;
 
if ((lo->lo_flags & LO_FLAGS_AUTOCLEAR) !=
 (info->lo_flags & LO_FLAGS_AUTOCLEAR))
@@ -1525,7 +1522,7 @@ static int lo_simple_ioctl(struct loop_device *lo, unsigned int cmd,
err = loop_set_block_size(lo, arg);
break;
default:
- 

Re: INFO: task hung in __get_super

2019-04-28 Thread syzbot

syzbot has found a reproducer for the following crash on:

HEAD commit:    037904a2 Merge branch 'x86-urgent-for-linus' of git://git...
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=135ff034a0
kernel config:  https://syzkaller.appspot.com/x/.config?x=a42d110b47dd6b36
dashboard link: https://syzkaller.appspot.com/bug?extid=10007d66ca02b08f0e60
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=1291b1f4a0
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=135385a8a0

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+10007d66ca02b08f0...@syzkaller.appspotmail.com

INFO: task syz-executor274:8097 blocked for more than 143 seconds.
  Not tainted 5.1.0-rc6+ #89
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor274 D28008  8097   8041 0x0004
Call Trace:
 context_switch kernel/sched/core.c:2877 [inline]
 __schedule+0x813/0x1cc0 kernel/sched/core.c:3518
 schedule+0x92/0x180 kernel/sched/core.c:3562
 __rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:285 [inline]
 rwsem_down_read_failed+0x213/0x420 kernel/locking/rwsem-xadd.c:302
 call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
 __down_read arch/x86/include/asm/rwsem.h:83 [inline]
 down_read+0x49/0x90 kernel/locking/rwsem.c:26
 __get_super.part.0+0x203/0x2e0 fs/super.c:788
 __get_super include/linux/spinlock.h:329 [inline]
 get_super+0x2e/0x50 fs/super.c:817
 fsync_bdev+0x19/0xd0 fs/block_dev.c:525
 invalidate_partition+0x36/0x60 block/genhd.c:1581
 drop_partitions block/partition-generic.c:443 [inline]
 rescan_partitions+0xef/0xa20 block/partition-generic.c:516
 __blkdev_reread_part+0x1a2/0x230 block/ioctl.c:173
 blkdev_reread_part+0x27/0x40 block/ioctl.c:193
 loop_reread_partitions+0x1c/0x40 drivers/block/loop.c:633
 loop_set_status+0xe57/0x1380 drivers/block/loop.c:1296
 loop_set_status64+0xc2/0x120 drivers/block/loop.c:1416
 lo_ioctl+0x8fc/0x2150 drivers/block/loop.c:1559
 __blkdev_driver_ioctl block/ioctl.c:303 [inline]
 blkdev_ioctl+0x6f2/0x1d10 block/ioctl.c:605
 block_ioctl+0xee/0x130 fs/block_dev.c:1933
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:509 [inline]
 do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
 __do_sys_ioctl fs/ioctl.c:720 [inline]
 __se_sys_ioctl fs/ioctl.c:718 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x441937
Code: 48 83 c4 08 48 89 d8 5b 5d c3 66 0f 1f 84 00 00 00 00 00 48 89 e8 48 f7 d8 48 39 c3 0f 92 c0 eb 92 66 90 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 0f 83 8d 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00

RSP: 002b:7ffcabed4e08 EFLAGS: 0202 ORIG_RAX: 0010
RAX: ffda RBX: 0003 RCX: 00441937
RDX: 7ffcabed4ea0 RSI: 4c04 RDI: 0004
RBP: 0003 R08:  R09: 000a
R10: 0075 R11: 0202 R12: 0001
R13: 0004 R14:  R15: 
INFO: task blkid:8099 blocked for more than 143 seconds.
  Not tainted 5.1.0-rc6+ #89
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
blkid   D27504  8099   8021 0x0004
Call Trace:
 context_switch kernel/sched/core.c:2877 [inline]
 __schedule+0x813/0x1cc0 kernel/sched/core.c:3518
 schedule+0x92/0x180 kernel/sched/core.c:3562
 schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3620
 __mutex_lock_common kernel/locking/mutex.c:1002 [inline]
 __mutex_lock+0x726/0x1310 kernel/locking/mutex.c:1072
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
 blkdev_put+0x34/0x560 fs/block_dev.c:1866
 blkdev_close+0x8b/0xb0 fs/block_dev.c:1915
 __fput+0x2e5/0x8d0 fs/file_table.c:278
 fput+0x16/0x20 fs/file_table.c:309
 task_work_run+0x14a/0x1c0 kernel/task_work.c:113
 tracehook_notify_resume include/linux/tracehook.h:188 [inline]
 exit_to_usermode_loop+0x273/0x2c0 arch/x86/entry/common.c:166
 prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
 do_syscall_64+0x52d/0x610 arch/x86/entry/common.c:293
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f1ae9c432b0
Code: Bad RIP value.
RSP: 002b:7ffc29ff6028 EFLAGS: 0246 ORIG_RAX: 0003
RAX:  RBX:  RCX: 7f1ae9c432b0
RDX:  RSI:  RDI: 0003
RBP:  R08: 0028 R09: 0168
R10:  R11: 0246 R12: 00b9d030
R13:  R14: 0003 R15: 0005

Showing all locks held in the system:
1 lock held by khungtaskd/1041:
 #0: 27887009 (rcu_read_lock){}, at: debug_show_all_locks+0x5f/0x27e

Re: INFO: task hung in __get_super

2018-06-19 Thread Tetsuo Handa
Dmitry Vyukov wrote:
> On Tue, Jun 19, 2018 at 4:10 PM, Tetsuo Handa
>  wrote:
> > On 2018/06/19 20:53, Dmitry Vyukov wrote:
> >> On Tue, Jun 19, 2018 at 1:44 PM, Tetsuo Handa
> >>  wrote:
> >>> This bug report is getting no feedback, but I guess that this bug is in
> >>> block or mm or locking layer rather than fs layer.
> >>>
> >>> NMI backtrace for this bug tends to report that sb_bread() from 
> >>> fill_super()
> >>>  from mount_bdev() is stalling is the cause of keep holding s_umount_key 
> >>> for
> >>> more than 120 seconds. What is strange is that NMI backtrace for this bug 
> >>> tends
> >>> to point at rcu_read_lock()/pagecache_get_page()/radix_tree_deref_slot()/
> >>> rcu_read_unlock() which is expected not to stall.
> >>>
> >>> Since CONFIG_RCU_CPU_STALL_TIMEOUT is set to 120 (and actually +5 due to
> >>> CONFIG_PROVE_RCU=y) which is longer than CONFIG_DEFAULT_HUNG_TASK_TIMEOUT,
> >>> maybe setting CONFIG_RCU_CPU_STALL_TIMEOUT to smaller values (e.g. 25) can
> >>> give us some hints...
> >>
> >> If an rcu stall is the true root cause of this, then I guess would see
> >> "rcu stall" bug too. Rcu stall is detected after 120 seconds, but task
> >> hang after 120-240 seconds. So rcu stall has much higher chances to be
> >> detected. Do you see the corresponding "rcu stall" bug?
> >
> > RCU stall is detected after 125 seconds due to CONFIG_PROVE_RCU=y
> > (e.g. 
> > https://syzkaller.appspot.com/bug?id=1fac0fd91219f3f2a03d6fa7deafc95fbed79cc2
> >  ).
> >
> > I didn't find the corresponding "rcu stall" bug. But it is not required
> > that one RCU stall takes longer than 120 seconds.
> >
> > down(); // Will take 120 seconds due to multiple RCU stalls
> >   rcu_read_lock():
> >   do_something();
> >   rcu_read_unlock(): // Took 30 seconds for unknown reason.
> >   rcu_read_lock():
> >   do_something();
> >   rcu_read_unlock(): // Took 30 seconds for unknown reason.
> >   rcu_read_lock():
> >   do_something();
> >   rcu_read_unlock(): // Took 30 seconds for unknown reason.
> >   rcu_read_lock():
> >   do_something();
> >   rcu_read_unlock(): // Took 30 seconds for unknown reason.
> > up();
> 
> 
> You think this is another false positive?
> Like this one 
> https://github.com/google/syzkaller/issues/516#issuecomment-395685629
> ?

According to https://syzkaller.appspot.com/text?tag=CrashLog=11db16c440 from
"INFO: rcu detected stall in __process_echoes":

[  859.630022] INFO: rcu_sched self-detected stall on CPU
[  859.635509]  0-: (124999 ticks this GP) idle=30e/1/4611686018427387906 softirq=287964/287964 fqs=31234
[  859.645716]   (t=125000 jiffies g=156333 c=156332 q=555)
(...snipped...)
[  860.20]  ? process_one_work+0x1ba0/0x1ba0
[  860.271135]  ? kthread_bind+0x40/0x40
[  860.274927]  ret_from_fork+0x3a/0x50
[  861.152252] INFO: task kworker/u4:2:59 blocked for more than 120 seconds.
[  861.159245]   Not tainted 4.18.0-rc1+ #109
[  861.163851] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

An RCU stall was reported immediately before khungtaskd fired. Since syzbot
assigns only 2 CPUs, it might not be a rare case that a hung process was unable
to run just because somebody else kept occupying CPU resources.

Well, "BUG: soft lockup in __process_echoes" will be a dup of
"INFO: rcu detected stall in __process_echoes". I wonder why the
softlockup detector waited for 135 seconds.

Well, "BUG: soft lockup in shrink_dcache_parent (2)" and
"BUG: soft lockup in snd_virmidi_output_trigger" and
"BUG: soft lockup in smp_call_function_many" and
"BUG: soft lockup in do_raw_spin_unlock (2)" waited for 134 seconds as well,
while "BUG: soft lockup in d_walk" waited for only 22 seconds...

Anyway, I think that in some cases RCU stalls/soft lockups are the cause of
hung tasks.


Re: INFO: task hung in __get_super

2018-06-19 Thread Dmitry Vyukov
On Tue, Jun 19, 2018 at 4:10 PM, Tetsuo Handa
 wrote:
> On 2018/06/19 20:53, Dmitry Vyukov wrote:
>> On Tue, Jun 19, 2018 at 1:44 PM, Tetsuo Handa
>>  wrote:
>>> This bug report is getting no feedback, but I guess that this bug is in
>>> block or mm or locking layer rather than fs layer.
>>>
>>> NMI backtrace for this bug tends to report that sb_bread() from fill_super()
>>>  from mount_bdev() is stalling is the cause of keep holding s_umount_key for
>>> more than 120 seconds. What is strange is that NMI backtrace for this bug 
>>> tends
>>> to point at rcu_read_lock()/pagecache_get_page()/radix_tree_deref_slot()/
>>> rcu_read_unlock() which is expected not to stall.
>>>
>>> Since CONFIG_RCU_CPU_STALL_TIMEOUT is set to 120 (and actually +5 due to
>>> CONFIG_PROVE_RCU=y) which is longer than CONFIG_DEFAULT_HUNG_TASK_TIMEOUT,
>>> maybe setting CONFIG_RCU_CPU_STALL_TIMEOUT to smaller values (e.g. 25) can
>>> give us some hints...
>>
>> If an rcu stall is the true root cause of this, then I guess would see
>> "rcu stall" bug too. Rcu stall is detected after 120 seconds, but task
>> hang after 120-240 seconds. So rcu stall has much higher chances to be
>> detected. Do you see the corresponding "rcu stall" bug?
>
> RCU stall is detected after 125 seconds due to CONFIG_PROVE_RCU=y
> (e.g. 
> https://syzkaller.appspot.com/bug?id=1fac0fd91219f3f2a03d6fa7deafc95fbed79cc2 
> ).
>
> I didn't find the corresponding "rcu stall" bug. But it is not required
> that one RCU stall takes longer than 120 seconds.
>
> down(); // Will take 120 seconds due to multiple RCU stalls
>   rcu_read_lock():
>   do_something();
>   rcu_read_unlock(): // Took 30 seconds for unknown reason.
>   rcu_read_lock():
>   do_something();
>   rcu_read_unlock(): // Took 30 seconds for unknown reason.
>   rcu_read_lock():
>   do_something();
>   rcu_read_unlock(): // Took 30 seconds for unknown reason.
>   rcu_read_lock():
>   do_something();
>   rcu_read_unlock(): // Took 30 seconds for unknown reason.
> up();


You think this is another false positive?
Like this one https://github.com/google/syzkaller/issues/516#issuecomment-395685629 ?


Re: INFO: task hung in __get_super

2018-06-19 Thread Tetsuo Handa
On 2018/06/19 20:53, Dmitry Vyukov wrote:
> On Tue, Jun 19, 2018 at 1:44 PM, Tetsuo Handa
>  wrote:
>> This bug report is getting no feedback, but I guess that this bug is in
>> block or mm or locking layer rather than fs layer.
>>
>> NMI backtrace for this bug tends to report that sb_bread() from fill_super()
>>  from mount_bdev() is stalling is the cause of keep holding s_umount_key for
>> more than 120 seconds. What is strange is that NMI backtrace for this bug 
>> tends
>> to point at rcu_read_lock()/pagecache_get_page()/radix_tree_deref_slot()/
>> rcu_read_unlock() which is expected not to stall.
>>
>> Since CONFIG_RCU_CPU_STALL_TIMEOUT is set to 120 (and actually +5 due to
>> CONFIG_PROVE_RCU=y) which is longer than CONFIG_DEFAULT_HUNG_TASK_TIMEOUT,
>> maybe setting CONFIG_RCU_CPU_STALL_TIMEOUT to smaller values (e.g. 25) can
>> give us some hints...
> 
> If an rcu stall is the true root cause of this, then I guess would see
> "rcu stall" bug too. Rcu stall is detected after 120 seconds, but task
> hang after 120-240 seconds. So rcu stall has much higher chances to be
> detected. Do you see the corresponding "rcu stall" bug?

RCU stall is detected after 125 seconds due to CONFIG_PROVE_RCU=y
(e.g. 
https://syzkaller.appspot.com/bug?id=1fac0fd91219f3f2a03d6fa7deafc95fbed79cc2 ).

I didn't find the corresponding "rcu stall" bug. But it is not required
that one RCU stall takes longer than 120 seconds.

down(); // Will take 120 seconds due to multiple RCU stalls
  rcu_read_lock():
  do_something();
  rcu_read_unlock(): // Took 30 seconds for unknown reason.
  rcu_read_lock():
  do_something();
  rcu_read_unlock(): // Took 30 seconds for unknown reason.
  rcu_read_lock():
  do_something();
  rcu_read_unlock(): // Took 30 seconds for unknown reason.
  rcu_read_lock():
  do_something();
  rcu_read_unlock(): // Took 30 seconds for unknown reason.
up();



Re: INFO: task hung in __get_super

2018-06-19 Thread Dmitry Vyukov
On Tue, Jun 19, 2018 at 1:44 PM, Tetsuo Handa
 wrote:
> This bug report is getting no feedback, but I guess that this bug is in
> block or mm or locking layer rather than fs layer.
>
> NMI backtrace for this bug tends to report that sb_bread() from fill_super()
>  from mount_bdev() is stalling is the cause of keep holding s_umount_key for
> more than 120 seconds. What is strange is that NMI backtrace for this bug 
> tends
> to point at rcu_read_lock()/pagecache_get_page()/radix_tree_deref_slot()/
> rcu_read_unlock() which is expected not to stall.
>
> Since CONFIG_RCU_CPU_STALL_TIMEOUT is set to 120 (and actually +5 due to
> CONFIG_PROVE_RCU=y) which is longer than CONFIG_DEFAULT_HUNG_TASK_TIMEOUT,
> maybe setting CONFIG_RCU_CPU_STALL_TIMEOUT to smaller values (e.g. 25) can
> give us some hints...

If an rcu stall is the true root cause of this, then I guess we would see an
"rcu stall" bug too. An rcu stall is detected after 120 seconds, but a task
hang after 120-240 seconds, so an rcu stall has much higher chances of being
detected. Do you see the corresponding "rcu stall" bug?

But, yes, we need to tune all timeouts. There is
https://github.com/google/syzkaller/issues/516 for this.
We also need "kernel/hung_task.c: allow to set checking interval
separately from timeout" to be merged:
https://groups.google.com/forum/#!topic/syzkaller/rOr3WBE-POY
as currently it's very hard to tune task hung timeout. But maybe we
will need similar patches for other watchdogs too if they have the
same problem.
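
For reference, the knob that exists today and the one the referenced
hung_task patch adds (names as in mainline sysctl documentation; values
illustrative only) would look like this in sysctl.conf:

```
# Existing knob (requires CONFIG_DETECT_HUNG_TASK=y):
kernel.hung_task_timeout_secs = 120
# Added by the "allow to set checking interval separately from timeout"
# patch referenced above, once merged:
kernel.hung_task_check_interval_secs = 10
```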


Re: INFO: task hung in __get_super

2018-06-19 Thread Tetsuo Handa
This bug report is getting no feedback, but I guess that this bug is in the
block or mm or locking layer rather than the fs layer.

NMI backtraces for this bug tend to report that sb_bread() from fill_super()
from mount_bdev() stalling is the cause of s_umount_key being held for more
than 120 seconds. What is strange is that NMI backtraces for this bug tend
to point at rcu_read_lock()/pagecache_get_page()/radix_tree_deref_slot()/
rcu_read_unlock(), which is expected not to stall.

Since CONFIG_RCU_CPU_STALL_TIMEOUT is set to 120 (and actually +5 due to
CONFIG_PROVE_RCU=y), which is longer than CONFIG_DEFAULT_HUNG_TASK_TIMEOUT,
maybe setting CONFIG_RCU_CPU_STALL_TIMEOUT to smaller values (e.g. 25) can
give us some hints...



INFO: task hung in __get_super

2018-04-01 Thread syzbot

Hello,

syzbot hit the following crash on upstream commit
10b84daddbec72c6b440216a69de9a9605127f7a (Sat Mar 31 17:59:00 2018 +)
Merge branch 'perf-urgent-for-linus' of  
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
syzbot dashboard link:  
https://syzkaller.appspot.com/bug?extid=10007d66ca02b08f0e60


Unfortunately, I don't have any reproducer for this crash yet.
Raw console output:  
https://syzkaller.appspot.com/x/log.txt?id=5899419228569600
Kernel config:  
https://syzkaller.appspot.com/x/.config?id=-2760467897697295172

compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+10007d66ca02b08f0...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for  
details.

If you forward the report, please keep this part and the footer.

IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
IPv6: ADDRCONF(NETDEV_UP): veth1: link is not ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
random: crng init done
INFO: task syz-executor3:13421 blocked for more than 120 seconds.
  Not tainted 4.16.0-rc7+ #9
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor3   D24672 13421   4481 0x0004
Call Trace:
 context_switch kernel/sched/core.c:2862 [inline]
 __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
 schedule+0xf5/0x430 kernel/sched/core.c:3499
 __rwsem_down_read_failed_common kernel/locking/rwsem-xadd.c:269 [inline]
 rwsem_down_read_failed+0x401/0x6e0 kernel/locking/rwsem-xadd.c:286
 call_rwsem_down_read_failed+0x18/0x30 arch/x86/lib/rwsem.S:94
 __down_read arch/x86/include/asm/rwsem.h:83 [inline]
 down_read+0xa4/0x150 kernel/locking/rwsem.c:26
 __get_super.part.9+0x1d3/0x280 fs/super.c:663
 __get_super include/linux/spinlock.h:310 [inline]
 get_super+0x2d/0x40 fs/super.c:692
 fsync_bdev+0x19/0x80 fs/block_dev.c:468
 invalidate_partition+0x35/0x60 block/genhd.c:1566
 drop_partitions.isra.12+0xcd/0x1d0 block/partition-generic.c:440
 rescan_partitions+0x72/0x900 block/partition-generic.c:513
 __blkdev_reread_part+0x15f/0x1e0 block/ioctl.c:173
 blkdev_reread_part+0x26/0x40 block/ioctl.c:193
 loop_reread_partitions+0x12f/0x1a0 drivers/block/loop.c:619
 loop_set_status+0x9bb/0xf60 drivers/block/loop.c:1161
 loop_set_status64+0x9d/0x110 drivers/block/loop.c:1271
 lo_ioctl+0xd86/0x1b70 drivers/block/loop.c:1381
 __blkdev_driver_ioctl block/ioctl.c:303 [inline]
 blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
 block_ioctl+0xde/0x120 fs/block_dev.c:1875
 vfs_ioctl fs/ioctl.c:46 [inline]
 do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
 SYSC_ioctl fs/ioctl.c:701 [inline]
 SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
RIP: 0033:0x454e79
RSP: 002b:7fda691eec68 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 7fda691ef6d4 RCX: 00454e79
RDX: 21c0 RSI: 4c04 RDI: 0013
RBP: 0072bea0 R08:  R09: 
R10:  R11: 0246 R12: 
R13: 0287 R14: 006f5d48 R15: 

Showing all locks held in the system:
2 locks held by khungtaskd/878:
 #0:  (rcu_read_lock){}, at: [<4cf2ddac>]  
check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
 #0:  (rcu_read_lock){}, at: [<4cf2ddac>] watchdog+0x1c5/0xd60  
kernel/hung_task.c:249
 #1:  (tasklist_lock){.+.+}, at: []  
debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470

2 locks held by getty/4404:
 #0:  (>ldisc_sem){}, at: []  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
 #1:  (>atomic_read_lock){+.+.}, at: [<3da58a6e>]  
n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131

2 locks held by getty/4405:
 #0:  (>ldisc_sem){}, at: []  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
 #1:  (>atomic_read_lock){+.+.}, at: [<3da58a6e>]  
n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131

2 locks held by getty/4406:
 #0:  (>ldisc_sem){}, at: []  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
 #1:  (>atomic_read_lock){+.+.}, at: [<3da58a6e>]  
n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131

2 locks held by getty/4407:
 #0:  (>ldisc_sem){}, at: []  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
 #1:  (>atomic_read_lock){+.+.}, at: [<3da58a6e>]  
n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131

2 locks held by getty/4408:
 #0:  (>ldisc_sem){}, at: []  
ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
 #1:  (>atomic_read_lock){+.+.}, at: [<3da58a6e>]  
n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131

2 locks held by getty/4409:
 #0:  (>ldisc_sem){}, at: []  
