Hi,
On 2023/07/26 18:02, David Howells wrote:
Hi,
With 6.5-rc2 (6.5.0-0.rc2.20230721gitf7e3a1bafdea.20.fc39.x86_64), I'm seeing
a bunch of processes getting stuck in the D state on my desktop after a few
hours of reading email and compiling stuff. It's happened every day this week
so far and I managed to grab stack traces of the stuck processes this morning
(see attached).
There are two blockdevs involved below, /dev/md2 and /dev/md3. md3 is a raid1
array with two partitions with an ext4 partition on it. md2 is similar but
it's dm-crypted and ext4 is on top of that.
David
---
1015 ? D 0:04 [md2_raid1]
1074 ? D 0:00 [jbd2/sda6-8]
1138 ? D 0:00 [jbd2/md3-8]
1167 ? D 0:10 [dmcrypt_write/253:0]
1202 ? D 0:03 [jbd2/dm-0-8]
117547 ? D 5:12 [kworker/u16:8+flush-9:3]
121540 ? D 0:46 [kworker/u16:10+flush-253:0]
125431 pts/2 Dl+ 0:00 emacs .stgit-edit.txt
125469 ? D 0:00 /usr/libexec/nmh/rcvstore +kernel
===1015===
PID TTY STAT TIME COMMAND
1015 ? D 0:04 [md2_raid1]
[<0>] md_super_wait+0xa2/0xe0
[<0>] md_bitmap_daemon_work+0x183/0x3b0
[<0>] md_check_recovery+0x42/0x5a0
[<0>] raid1d+0x87/0x16f0 [raid1]
[<0>] md_thread+0xab/0x190
[<0>] kthread+0xe5/0x120
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1b/0x30
This means that either the I/O writing the superblock is stuck in the
underlying disks, or the superblock write path itself is broken. I
think the former is more likely. You'll need to find out where this
I/O is now. If you can confirm that there is no I/O outstanding in the
underlying disks, then the problem is in raid.
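One way to check for outstanding I/O in the member disks is to read the
per-device in-flight counters from sysfs. Below is a minimal sketch, assuming
the usual /sys/block/<array>/slaves and <member>/inflight layout. The function
names and the md2 argument are only illustrative, and a zero count here does
not by itself rule out a request held up elsewhere in the stack.

```python
import os


def parse_inflight(text):
    """Parse the contents of an 'inflight' file: '<reads> <writes>'."""
    reads, writes = text.split()
    return int(reads), int(writes)


def member_inflight(array, sysfs_root="/sys"):
    """Return {member: (reads, writes)} for each entry under <array>/slaves."""
    slaves_dir = os.path.join(sysfs_root, "block", array, "slaves")
    result = {}
    for member in sorted(os.listdir(slaves_dir)):
        # Each slaves entry points at the member device's own sysfs directory.
        path = os.path.join(slaves_dir, member, "inflight")
        with open(path) as f:
            result[member] = parse_inflight(f.read())
    return result


if __name__ == "__main__":
    array = "md2"  # assumption: the array name from the report
    if os.path.isdir(os.path.join("/sys", "block", array, "slaves")):
        for name, (r, w) in member_inflight(array).items():
            print(f"{name}: {r} reads, {w} writes in flight")
    else:
        print(f"{array}: no slaves directory found in sysfs")
```

If the write counter of one member stays non-zero while the array is hung,
that would point at the underlying disk rather than at raid.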
===1074===
PID TTY STAT TIME COMMAND
1074 ? D 0:00 [jbd2/sda6-8]
[<0>] jbd2_journal_commit_transaction+0x11a6/0x1a20
[<0>] kjournald2+0xad/0x280
[<0>] kthread+0xe5/0x120
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1b/0x30
===1138===
PID TTY STAT TIME COMMAND
1138 ? D 0:00 [jbd2/md3-8]
[<0>] jbd2_journal_commit_transaction+0x162d/0x1a20
[<0>] kjournald2+0xad/0x280
[<0>] kthread+0xe5/0x120
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1b/0x30
===1167===
PID TTY STAT TIME COMMAND
1167 ? D 0:10 [dmcrypt_write/253:0]
[<0>] md_super_wait+0xa2/0xe0
[<0>] md_bitmap_unplug+0xad/0x120
[<0>] flush_bio_list+0xf3/0x100 [raid1]
[<0>] raid1_unplug+0x3b/0xb0 [raid1]
[<0>] __blk_flush_plug+0xd8/0x160
[<0>] blk_finish_plug+0x29/0x40
[<0>] dmcrypt_write+0x132/0x140 [dm_crypt]
[<0>] kthread+0xe5/0x120
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1b/0x30
===1202===
PID TTY STAT TIME COMMAND
1202 ? D 0:03 [jbd2/dm-0-8]
[<0>] jbd2_journal_commit_transaction+0x162d/0x1a20
[<0>] kjournald2+0xad/0x280
[<0>] kthread+0xe5/0x120
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1b/0x30
===117547===
PID TTY STAT TIME COMMAND
117547 ? D 5:12 [kworker/u16:8+flush-9:3]
[<0>] blk_mq_get_tag+0x11e/0x2b0
Is this one of the raid's underlying disks? If so, it looks like the
I/O is stuck in the underlying disks.
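The worker name carries the backing device as major:minor ("flush-9:3"), and
that can be mapped to a device name through /sys/dev/block. Below is a minimal
sketch, assuming the usual sysfs symlink layout. resolve_dev and flusher_dev
are hypothetical helper names, not existing tools.

```python
import os


def flusher_dev(comm):
    """Extract 'major:minor' from a name like '[kworker/u16:8+flush-9:3]'."""
    comm = comm.strip("[]")
    marker = "flush-"
    idx = comm.find(marker)
    return comm[idx + len(marker):] if idx != -1 else None


def resolve_dev(majmin, sysfs_root="/sys"):
    """Return the block device name for a 'major:minor' string, or None."""
    link = os.path.join(sysfs_root, "dev", "block", majmin)
    if not os.path.islink(link):
        return None
    # The link target ends in the device's kernel name, e.g. .../block/md3.
    return os.path.basename(os.path.realpath(link))


if __name__ == "__main__":
    for comm in ("[kworker/u16:8+flush-9:3]", "[kworker/u16:10+flush-253:0]"):
        majmin = flusher_dev(comm)
        print(comm, "->", majmin, "->", resolve_dev(majmin))
```

This only names the device the flusher is writing back for. It does not say
which queue the tag allocation is waiting on.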
Thanks,
Kuai
[<0>] __blk_mq_alloc_requests+0x1bc/0x350
[<0>] blk_mq_submit_bio+0x2c7/0x680
[<0>] __submit_bio+0x8b/0x170
[<0>] submit_bio_noacct_nocheck+0x159/0x370
[<0>] __block_write_full_folio+0x1e1/0x400
[<0>] writepage_cb+0x1a/0x70
[<0>] write_cache_pages+0x144/0x3b0
[<0>] do_writepages+0x164/0x1e0
[<0>] __writeback_single_inode+0x3d/0x360
[<0>] writeback_sb_inodes+0x1ed/0x4b0
[<0>] __writeback_inodes_wb+0x4c/0xf0
[<0>] wb_writeback+0x298/0x310
[<0>] wb_workfn+0x35b/0x510
[<0>] process_one_work+0x1de/0x3f0
[<0>] worker_thread+0x51/0x390
[<0>] kthread+0xe5/0x120
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1b/0x30
===121540===
PID TTY STAT TIME COMMAND
121540 ? D 0:46 [kworker/u16:10+flush-253:0]
[<0>] folio_wait_bit_common+0x13d/0x350
[<0>] mpage_prepare_extent_to_map+0x309/0x4d0
[<0>] ext4_do_writepages+0x25d/0xc90
[<0>] ext4_writepages+0xad/0x180
[<0>] do_writepages+0xcf/0x1e0
[<0>] __writeback_single_inode+0x3d/0x360
[<0>] writeback_sb_inodes+0x1ed/0x4b0
[<0>] __writeback_inodes_wb+0x4c/0xf0
[<0>] wb_writeback+0x298/0x310
[<0>] wb_workfn+0x35b/0x510
[<0>] process_one_work+0x1de/0x3f0
[<0>] worker_thread+0x51/0x390
[<0>] kthread+0xe5/0x120
[<0>] ret_from_fork+0x31/0x50
[<0>] ret_from_fork_asm+0x1b/0x30
===125431===
PID TTY STAT TIME COMMAND
125431 pts/2 Dl+ 0:00 emacs .stgit-edit.txt
[<0>] jbd2_log_wait_commit+0xd8/0x140
[<0>] ext4_sync_file+0x1cc/0x380
[<0>] __x64_sys_fsync+0x3b/0x70
[<0>] do_syscall_64+0x5d/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
===125469===
PID TTY STAT TIME COMMAND
125469 ? D 0:00 /usr/libexec/nmh/rcvstore +kernel
[<0>] folio_wait_bit_common+0x13d/0x350
[<0>] folio_wait_writeback+0x2c/0x90
[<0>] truncate_inode_partial_folio+0x5e/0x1a0
[<0>] truncate_inode_pages_range+0x1da/0x400
[<0>] truncate_pagecache+0x47/0x60
[<0>] ext4_setattr+0x685/0xba0
[<0>] notify_change+0x1e0/0x4a0
[<0>] do_truncate+0x98/0xf0
[<0>] do_sys_ftruncate+0x15c/0x1b0
[<0>] do_syscall_64+0x5d/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel