I was able to capture the attached log messages during the most recent
occurrence. At the time, I was running a scrub on two different pools
simultaneously.

This may not be related to ZFS; it may instead be related to the mpt3sas
driver or the LSI card itself. It looks to me as if the mpt3sas driver
is resetting the LSI card, which looks like a power-on event and in turn
drops the connections to the SAS switch. Is that a correct assessment?
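
One way to check that assessment would be to line up the mpt3sas
messages against the hung-task reports by kernel timestamp. Below is a
minimal Python sketch, assuming the capture is in dmesg format with a
"[seconds.microseconds]" prefix on each line. The function name
`timeline` and the plain substring match on "mpt3sas" are my own
illustrative choices; nothing here comes from the driver itself.

```python
import re

# Kernel log prefix such as "[184383.479511] " (dmesg-style timestamp).
TS_RE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s?(.*)$")


def timeline(lines, driver="mpt3sas"):
    """Return (timestamp, kind, message) tuples, sorted by timestamp.

    kind is "driver" for lines mentioning the driver name (assumed to be a
    simple substring match) and "hung" for hung-task reports.
    """
    events = []
    for line in lines:
        m = TS_RE.match(line)
        if not m:
            continue  # skip wrapped or untimestamped lines
        ts, msg = float(m.group(1)), m.group(2).rstrip()
        if driver in msg:
            events.append((ts, "driver", msg))
        elif msg.startswith("INFO: task ") and "blocked for more than" in msg:
            events.append((ts, "hung", msg))
    return sorted(events)
```

Feeding it the lines of scsireset_log_capture.txt should show whether
driver messages cluster shortly before each "blocked for more than 120
seconds" report, which is what I would expect if the reset is the
trigger.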


** Attachment added: "scsireset_log_capture.txt"
   https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1889110/+attachment/5400025/+files/scsireset_log_capture.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1889110

Title:
  zfs pool locks and see "INFO: task txg_sync:4307 blocked for more than
  120 seconds. "

Status in zfs-linux package in Ubuntu:
  New

Bug description:
  The ZFS filesystem becomes unresponsive, and the NFS shares on it
  subsequently become unresponsive as well. ESXi sees all paths down.

  The following error appears three times in a row.

  
  [184383.479511] INFO: task txg_sync:4307 blocked for more than 120 seconds.
  [184383.479565]       Tainted: P          IO      5.4.0-42-generic #46-Ubuntu
  [184383.479607] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [184383.479655] txg_sync        D    0  4307      2 0x80004000
  [184383.479658] Call Trace:
  [184383.479670]  __schedule+0x2e3/0x740
  [184383.479673]  schedule+0x42/0xb0
  [184383.479676]  schedule_timeout+0x152/0x2f0
  [184383.479683]  ? __next_timer_interrupt+0xe0/0xe0
  [184383.479685]  io_schedule_timeout+0x1e/0x50
  [184383.479697]  __cv_timedwait_common+0x15e/0x1c0 [spl]
  [184383.479702]  ? wait_woken+0x80/0x80
  [184383.479710]  __cv_timedwait_io+0x19/0x20 [spl]
  [184383.479816]  zio_wait+0x11b/0x230 [zfs]
  [184383.479905]  ? __raw_spin_unlock+0x9/0x10 [zfs]
  [184383.479983]  dsl_pool_sync+0xbc/0x410 [zfs]
  [184383.480069]  spa_sync_iterate_to_convergence+0xe0/0x1c0 [zfs]
  [184383.480156]  spa_sync+0x312/0x5b0 [zfs]
  [184383.480245]  txg_sync_thread+0x27a/0x310 [zfs]
  [184383.480334]  ? txg_dispatch_callbacks+0x100/0x100 [zfs]
  [184383.480344]  thread_generic_wrapper+0x83/0xa0 [spl]
  [184383.480347]  kthread+0x104/0x140
  [184383.480356]  ? clear_bit+0x20/0x20 [spl]
  [184383.480358]  ? kthread_park+0x90/0x90
  [184383.480361]  ret_from_fork+0x35/0x40


  Then nfsd hangs as well.

  
  [184866.787445] INFO: task nfsd:6585 blocked for more than 120 seconds.
  [184866.787485]       Tainted: P          IO      5.4.0-42-generic #46-Ubuntu
  [184866.787526] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
  [184866.787573] nfsd            D    0  6585      2 0x80004000
  [184866.787575] Call Trace:
  [184866.787578]  __schedule+0x2e3/0x740
  [184866.787675]  ? __raw_spin_unlock+0x9/0x10 [zfs]
  [184866.787678]  schedule+0x42/0xb0
  [184866.787685]  cv_wait_common+0x133/0x180 [spl]
  [184866.787688]  ? wait_woken+0x80/0x80
  [184866.787695]  __cv_wait+0x15/0x20 [spl]
  [184866.787764]  dmu_tx_wait+0x1ee/0x210 [zfs]
  [184866.787834]  dmu_tx_assign+0x49/0x70 [zfs]
  [184866.787929]  zfs_write+0x461/0xd40 [zfs]
  [184866.788025]  ? atomic_sub_return.constprop.0+0xd/0x20 [zfs]
  [184866.788033]  ? atomic_dec+0xd/0x20 [spl]
  [184866.788116]  ? __raw_spin_unlock+0x9/0x10 [zfs]
  [184866.788122]  ? __d_obtain_alias+0x36/0x90
  [184866.788217]  zpl_write_common_iovec+0xad/0x120 [zfs]
  [184866.788313]  zpl_iter_write_common+0x8e/0xb0 [zfs]
  [184866.788409]  zpl_iter_write+0x56/0x90 [zfs]
  [184866.788413]  do_iter_readv_writev+0x14f/0x1d0
  [184866.788416]  do_iter_write+0x84/0x1a0
  [184866.788418]  vfs_iter_write+0x19/0x30
  [184866.788442]  nfsd_vfs_write+0xe0/0x480 [nfsd]
  [184866.788454]  nfsd_write+0x7a/0x160 [nfsd]
  [184866.788458]  ? kmem_cache_alloc+0x16d/0x230
  [184866.788472]  nfsd3_proc_write+0xc3/0x170 [nfsd]
  [184866.788483]  nfsd_dispatch+0xd6/0x220 [nfsd]
  [184866.788508]  svc_process_common+0x3af/0x700 [sunrpc]
  [184866.788527]  ? svc_sock_secure_port+0x16/0x30 [sunrpc]
  [184866.788538]  ? nfsd_svc+0x2d0/0x2d0 [nfsd]
  [184866.788557]  svc_process+0xd9/0x110 [sunrpc]
  [184866.788568]  nfsd+0xe8/0x150 [nfsd]
  [184866.788570]  kthread+0x104/0x140
  [184866.788581]  ? nfsd_destroy+0x60/0x60 [nfsd]
  [184866.788583]  ? kthread_park+0x90/0x90
  [184866.788585]  ret_from_fork+0x35/0x40
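
  The two reports above share the same "INFO: task NAME:PID blocked for
  more than N seconds." header, so the affected tasks can be pulled out
  of a longer log mechanically. A minimal Python sketch, assuming
  dmesg-style timestamps; the names `HUNG_RE` and `hung_tasks` are
  hypothetical and used only for illustration.

```python
import re

# Matches the hung-task header line emitted by the kernel, e.g.
# "[184383.479511] INFO: task txg_sync:4307 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(text):
    """Return a list of (timestamp, task name, pid, timeout seconds)."""
    return [
        (float(m["ts"]), m["comm"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_RE.finditer(text)
    ]
```

  Run over the two traces in this report, this would list txg_sync
  (pid 4307) first and nfsd (pid 6585) roughly 483 seconds later,
  consistent with nfsd stalling behind the blocked sync thread.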


  Linux zfs-01 5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC
  2020 x86_64 x86_64 x86_64 GNU/Linux

  root@zfs-01:/# lsb_release -rd
  Description:    Ubuntu 20.04 LTS
  Release:        20.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1889110/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
