** Changed in: linux (Ubuntu Kinetic)
       Status: New => In Progress

** Changed in: linux (Ubuntu Kinetic)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Kinetic)
     Assignee: (unassigned) => gerald.yang (gerald-yang-tw)

** Description changed:

  [ Impact ]
  
  When running fio against an NVMe device on an AWS test instance with the
  5.19 kernel, IOs get stuck and fio never finishes.
  
  fio command:
  sudo fio --name=read_iops_test --filename=/dev/nvme1n1 --filesize=50G --time_based --ramp_time=2s --runtime=1m --ioengine=libaio --direct=1 --verify=0 --randrepeat=0 --bs=16K --iodepth=256 --rw=randread
  read_iops_test: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=256
  fio-3.28
  Starting 1 process
  Jobs: 1 (f=0): [/(1)][-.-%][eta 01m:02s]
  
  IOs get completely stuck; after a while the kernel log shows:
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.230970] INFO: task fio:2545 
blocked for more than 120 seconds.
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.232878]       Not tainted 
5.19.0-43-generic #44~22.04.1-Ubuntu
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.234738] "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237053] task:fio             
state:D stack:    0 pid: 2545 ppid:  2495 flags:0x00000002
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237057] Call Trace:
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237058]  <TASK>
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237061]  
__schedule+0x257/0x5d0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237066]  schedule+0x68/0x110
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237068]  io_schedule+0x46/0x80
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237069]  
blk_mq_get_tag+0x117/0x300
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237072]  ? 
destroy_sched_domains_rcu+0x40/0x40
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237076]  
__blk_mq_alloc_requests+0xc4/0x1e0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237079]  
blk_mq_get_new_requests+0xf6/0x1a0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237080]  
blk_mq_submit_bio+0x1eb/0x440
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237082]  
__submit_bio+0x109/0x1a0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237085]  
submit_bio_noacct_nocheck+0xc2/0x120
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237087]  
submit_bio_noacct+0x209/0x590
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237088]  submit_bio+0x40/0xf0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237090]  
__blkdev_direct_IO_async+0x146/0x1f0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237092]  
blkdev_direct_IO.part.0+0x40/0xa0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237093]  
blkdev_read_iter+0x9f/0x1a0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237094]  aio_read+0xec/0x1d0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237097]  ? 
__io_submit_one.constprop.0+0x113/0x200
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237099]  
__io_submit_one.constprop.0+0x113/0x200
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237100]  ? 
__io_submit_one.constprop.0+0x113/0x200
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237101]  
io_submit_one+0xe8/0x3d0
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237103]  
__x64_sys_io_submit+0x84/0x190
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237104]  ? 
do_syscall_64+0x69/0x90
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237106]  ? 
do_syscall_64+0x69/0x90
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237107]  
do_syscall_64+0x59/0x90
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237108]  ? 
syscall_exit_to_user_mode+0x2a/0x50
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237110]  ? 
do_syscall_64+0x69/0x90
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237111]  
entry_SYSCALL_64_after_hwframe+0x63/0xcd
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237113] RIP: 
0033:0x7f44f351ea3d
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237116] RSP: 
002b:00007fff1dcfe558 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237117] RAX: ffffffffffffffda 
RBX: 00007f44f2272b68 RCX: 00007f44f351ea3d
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237118] RDX: 000056315d9ad828 
RSI: 0000000000000001 RDI: 00007f44f224f000
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237119] RBP: 00007f44f224f000 
R08: 00007f44e9430000 R09: 00000000000002d8
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237120] R10: 0000000000000000 
R11: 0000000000000246 R12: 0000000000000001
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237120] R13: 0000000000000000 
R14: 000056315d9ad828 R15: 000056315d9e1830
  Jun  1 03:57:52 ip-172-31-39-141 kernel: [  370.237122]  </TASK>
  
  This issue cannot be reproduced on the 5.15 and 6.2 kernels.
  
  From the call trace, the task was stuck for more than 120 seconds waiting
  for previous IOs to complete and free their tags, so that new IO requests
  could obtain tags.
  In fact, not all of the previous IOs were stuck: at least some of them
  should have completed, but the waiters were never woken up.
  
  This issue is fixed by the upstream commit below, which was merged in
  kernel 6.1:
  commit 4acb83417cadfdcbe64215f9d0ddcf3132af808e
  Author: Keith Busch <kbu...@kernel.org>
  Date:   Fri Sep 9 11:40:22 2022 -0700
  
      sbitmap: fix batched wait_cnt accounting
  
      Batched completions can clear multiple bits, but we're only decrementing
      the wait_cnt by one each time. This can cause waiters to never be woken,
      stalling IO. Use the batched count instead.
  
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215679
      Signed-off-by: Keith Busch <kbu...@kernel.org>
      Link: https://lore.kernel.org/r/20220909184022.1709476-1-kbu...@fb.com
      Signed-off-by: Jens Axboe <ax...@kernel.dk>
  
  To SRU the commit to the 5.19 kernel, we also need to SRU all of its
  dependencies and one further fix:
  
  30514bd2dd4e sbitmap: fix lockup while swapping
  4acb83417cad sbitmap: fix batched wait_cnt accounting
  c35227d4e8cb sbitmap: Use atomic_long_try_cmpxchg in __sbitmap_queue_get_batch
  48c033314f37 sbitmap: Avoid leaving waitqueue in invalid state in __sbq_wake_up()
- bce1b56c7382 Revert "sbitmap: fix batched wait_cnt accounting"
- 16ede66973c8 sbitmap: fix batched wait_cnt accounting
  ddbfc34fcf5d sbitmap: remove unnecessary code in __sbitmap_queue_get_batch
  040b83fcecfb sbitmap: fix possible io hung due to lost wakeup
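  
  The accounting problem that the main fix (4acb83417cad) addresses can be
  illustrated with the small user-space model below. This is only a sketch:
  the names and constants (WAKE_BATCH, INFLIGHT_IOS, wait_cnt, run) are
  simplified stand-ins for the kernel's lib/sbitmap.c logic, not the actual
  implementation.
  
  /* Illustrative user-space model only, not kernel code. */
  #include <stdio.h>
  
  #define WAKE_BATCH   8   /* waiters are woken once wait_cnt reaches 0 */
  #define INFLIGHT_IOS 16  /* completions still expected while a task waits */
  
  /* Each completion event clears 'cleared' tags, but the wake accounting
   * only subtracts 'accounted' from wait_cnt. Returns how many wake-ups
   * actually fire before the in-flight IOs run out. */
  static int run(int cleared, int accounted)
  {
      int wait_cnt = WAKE_BATCH;
      int wakeups = 0;
  
      for (int done = 0; done < INFLIGHT_IOS; done += cleared) {
          wait_cnt -= accounted;
          if (wait_cnt <= 0) {          /* threshold reached: wake waiters */
              wakeups++;
              wait_cnt += WAKE_BATCH;   /* re-arm for the next waiters */
          }
      }
      return wakeups;
  }
  
  int main(void)
  {
      /* Buggy accounting: a batched completion clears 4 tags but wait_cnt
       * is only decremented by 1, so the wake threshold is never reached
       * and the task waiting in blk_mq_get_tag() sleeps forever. */
      printf("decrement by 1 per batch: %d wakeups\n", run(4, 1));
  
      /* Fixed accounting: decrement by the number of tags the batch
       * actually cleared, so completions reliably produce wake-ups. */
      printf("decrement by batch size : %d wakeups\n", run(4, 4));
      return 0;
  }
  
  With the per-batch decrement of one, the wake threshold is never reached
  before the in-flight IOs are exhausted, which matches the stranded waiter
  seen in the call trace above.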
  
  [ Test Plan ]
  
  This can be reproduced simply by launching an instance on AWS EC2 and
  running the above fio command against an NVMe device for a few hours to
  make sure IOs do not get stuck.
  
  I have built a test kernel with the above commits on top of the 5.19.0-43 generic kernel here:
  https://launchpad.net/~gerald-yang-tw/+archive/ubuntu/361041-generic
  
  With this test kernel, fio has been running for a few hours without any
  issues.
  
  [ Where problems could occur ]
  
  sbitmap is mainly used by blk-mq in the block layer, by SCSI drivers, and
  by the Fungible ethernet driver.
  If any issue were introduced in sbitmap, the symptom would be hung IO, or
  packets getting stuck in the Fungible driver.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2022318

Title:
  [SRU] IO gets stuck on 5.19 kernel

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Kinetic:
  In Progress


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2022318/+subscriptions

