It is worth noting that using dededuplication does consume a lot of
memory.

Deduplication tables consume memory and eventually spill over and
consume disk space - this causes extra read and write operations for
every block of data on which deduplication is attempted cause more I/O
waits.

A system with large pools with small memory areas does not perform
deduplication well.

So, if you run the following command on your pool (pool-ssd in this
example)

sudo zdb -s pool-ssd

..you see something like the following:

  pool: pool-ssd
 state: ONLINE
  scan: scrub repaired 0B in 0 days 00:31:14 with 0 errors on Wed Sep  2 
14:35:52 2020
config:

        NAME                                            STATE     READ WRITE 
CKSUM
        pool-ssd                                        ONLINE       0     0    
 0
          mirror-0                                      ONLINE       0     0    
 0
            ata-INTEL_SSDSC2BW480H6_CVTR608104NW480EGN  ONLINE       0     0    
 0
            ata-INTEL_SSDSC2BW480H6_CVTR552500S0480EGN  ONLINE       0     0    
 0

errors: No known data errors

 dedup: DDT entries 4258028, size 448B on disk, 306B in core

bucket              allocated                       referenced          
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    3.63M    464G    226G    226G    3.63M    464G    226G    226G
     2     376K   46.1G   24.3G   24.3G     822K    100G   53.2G   53.2G
     4    51.3K   5.60G   2.51G   2.51G     240K   26.3G   11.8G   11.8G
     8    7.49K    925M    544M    544M    77.9K   9.41G   5.47G   5.47G
    16    3.32K    415M    261M    261M    68.8K   8.41G   5.35G   5.35G
    32      657   78.7M   62.1M   62.1M    29.3K   3.51G   2.79G   2.79G
    64      389   46.2M   42.7M   42.7M    34.0K   4.03G   3.78G   3.78G
   128      248   30.1M   29.5M   29.5M    44.8K   5.44G   5.32G   5.32G
   256      339   42.1M   40.8M   40.8M     123K   15.4G   14.9G   14.9G
   512      374   46.8M   46.1M   46.1M     271K   33.9G   33.5G   33.5G
    1K      254   31.8M   31.3M   31.3M     355K   44.4G   43.8G   43.8G
    2K        5    640K    513K    513K    12.9K   1.61G   1.20G   1.20G
    4K        4    512K    512K    512K    22.7K   2.84G   2.84G   2.84G
    8K        8      1M    897K    897K    89.3K   11.2G   10.1G   10.1G
   64K        2    256K      6K      6K     226K   28.3G    679M    679M
 Total    4.06M    517G    254G    254G    5.99M    759G    421G    421G

---
Looking at the "dedup: DDT entries 4258028, size 448B on disk, 306B in core" 
line, we have:

4258028 de-dup entries, each entry used 306 bytes, so that's ~1.2GB of
memory for just the de-dup/ table.  But it is capped at 1/4 the size of
the ARC, so you may find on a small memory system such as a raspberry pi
that a lot of the dedup table can't be stored in memory and needs to be
on-disk, and this increases the I/O waits.

It may be an idea to disable de-dup and see if it helps to resolve your
issue on a comparatively slow and memory constrained system such as a
raspberry pi when you have large pools

** Changed in: zfs-linux (Ubuntu)
       Status: New => Incomplete

** Changed in: zfs-linux (Ubuntu)
   Importance: Undecided => Medium

** Changed in: zfs-linux (Ubuntu)
     Assignee: (unassigned) => Colin Ian King (colin-king)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to zfs-linux in Ubuntu.
https://bugs.launchpad.net/bugs/1899249

Title:
  OpenZFS writing stalls, under load

Status in zfs-linux package in Ubuntu:
  Incomplete

Bug description:
  Using a QNAP 4-drive USB enclosure, with a set of SSDs, on a Raspberry
  Pi 8GB. ZFS deduplication, and LZJB compression is enabled.

  This issue seems to occur, intermittently, after some time (happens
  with both SMB access, via Samba, and when interacting with the system,
  via SSH), and never previously occurred, until a few months ago, and I
  sometimes have to force a reboot of the system (at the cost of some
  data loss), in order to use it again.

  The "dmesg" log reports:

  [25375.911590] z_wr_iss_h      D    0  2161      2 0x00000028
  [25375.911606] Call trace:
  [25375.911627]  __switch_to+0x104/0x170
  [25375.911639]  __schedule+0x30c/0x7c0
  [25375.911647]  schedule+0x3c/0xb8
  [25375.911655]  io_schedule+0x20/0x58
  [25375.911668]  rq_qos_wait+0x100/0x178
  [25375.911677]  wbt_wait+0xb4/0xf0
  [25375.911687]  __rq_qos_throttle+0x38/0x50
  [25375.911700]  blk_mq_make_request+0x128/0x610
  [25375.911712]  generic_make_request+0xb4/0x2d8
  [25375.911722]  submit_bio+0x48/0x218
  [25375.911960]  vdev_disk_io_start+0x670/0x9f8 [zfs]
  [25375.912181]  zio_vdev_io_start+0xdc/0x2b8 [zfs]
  [25375.912400]  zio_nowait+0xd4/0x170 [zfs]
  [25375.912617]  vdev_mirror_io_start+0xa8/0x1b0 [zfs]
  [25375.912839]  zio_vdev_io_start+0x248/0x2b8 [zfs]
  [25375.913057]  zio_execute+0xac/0x110 [zfs]
  [25375.913096]  taskq_thread+0x2f8/0x570 [spl]
  [25375.913108]  kthread+0xfc/0x128
  [25375.913119]  ret_from_fork+0x10/0x1c
  [25375.913149] INFO: task txg_sync:2333 blocked for more than 120 seconds.
  [25375.919916]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25375.926848] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25375.934835] txg_sync        D    0  2333      2 0x00000028
  [25375.934850] Call trace:
  [25375.934869]  __switch_to+0x104/0x170
  [25375.934879]  __schedule+0x30c/0x7c0
  [25375.934887]  schedule+0x3c/0xb8
  [25375.934899]  schedule_timeout+0x9c/0x190
  [25375.934908]  io_schedule_timeout+0x28/0x48
  [25375.934946]  __cv_timedwait_common+0x1a8/0x1f8 [spl]
  [25375.934982]  __cv_timedwait_io+0x3c/0x50 [spl]
  [25375.935205]  zio_wait+0x130/0x2a0 [zfs]
  [25375.935423]  dsl_pool_sync+0x3fc/0x498 [zfs]
  [25375.935650]  spa_sync+0x538/0xe68 [zfs]
  [25375.935867]  txg_sync_thread+0x2c0/0x468 [zfs]
  [25375.935911]  thread_generic_wrapper+0x74/0xa0 [spl]
  [25375.935924]  kthread+0xfc/0x128
  [25375.935935]  ret_from_fork+0x10/0x1c
  [25375.936017] INFO: task zbackup:75339 blocked for more than 120 seconds.
  [25375.942780]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25375.949710] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25375.957702] zbackup         D    0 75339   5499 0x00000000
  [25375.957716] Call trace:
  [25375.957732]  __switch_to+0x104/0x170
  [25375.957742]  __schedule+0x30c/0x7c0
  [25375.957750]  schedule+0x3c/0xb8
  [25375.957789]  cv_wait_common+0x188/0x1b0 [spl]
  [25375.957823]  __cv_wait+0x30/0x40 [spl]
  [25375.958045]  zil_commit_impl+0x234/0xd30 [zfs]
  [25375.958263]  zil_commit+0x48/0x70 [zfs]
  [25375.958481]  zfs_create+0x544/0x7d0 [zfs]
  [25375.958698]  zpl_create+0xb8/0x178 [zfs]
  [25375.958711]  lookup_open+0x4ec/0x6a8
  [25375.958721]  do_last+0x260/0x8c0
  [25375.958730]  path_openat+0x84/0x258
  [25375.958739]  do_filp_open+0x84/0x108
  [25375.958752]  do_sys_open+0x180/0x2b0
  [25375.958763]  __arm64_sys_openat+0x2c/0x38
  [25375.958773]  el0_svc_common.constprop.0+0x80/0x218
  [25375.958781]  el0_svc_handler+0x34/0xa0
  [25375.958791]  el0_svc+0x10/0x2cc
  [25375.958801] INFO: task zbackup:95187 blocked for more than 120 seconds.
  [25375.965564]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25375.972492] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25375.980479] zbackup         D    0 95187   5499 0x00000000
  [25375.980493] Call trace:
  [25375.980514]  __switch_to+0x104/0x170
  [25375.980525]  __schedule+0x30c/0x7c0
  [25375.980536]  schedule+0x3c/0xb8
  [25375.980578]  cv_wait_common+0x188/0x1b0 [spl]
  [25375.980612]  __cv_wait+0x30/0x40 [spl]
  [25375.980834]  zil_commit_impl+0x234/0xd30 [zfs]
  [25375.981052]  zil_commit+0x48/0x70 [zfs]
  [25375.981280]  zfs_write+0xa3c/0xb90 [zfs]
  [25375.981498]  zpl_write_common_iovec+0xac/0x120 [zfs]
  [25375.981726]  zpl_iter_write+0xe4/0x150 [zfs]
  [25375.981766]  new_sync_write+0x100/0x1a8
  [25375.981776]  __vfs_write+0x74/0x90
  [25375.981784]  vfs_write+0xe4/0x1c8
  [25375.981794]  ksys_write+0x78/0x100
  [25375.981803]  __arm64_sys_write+0x24/0x30
  [25375.981813]  el0_svc_common.constprop.0+0x80/0x218
  [25375.981821]  el0_svc_handler+0x34/0xa0
  [25375.981830]  el0_svc+0x10/0x2cc
  [25375.981842] INFO: task zbackup:95188 blocked for more than 120 seconds.
  [25375.988615]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25375.995543] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25376.003532] zbackup         D    0 95188   5499 0x00000008
  [25376.003545] Call trace:
  [25376.003561]  __switch_to+0x104/0x170
  [25376.003571]  __schedule+0x30c/0x7c0
  [25376.003579]  schedule+0x3c/0xb8
  [25376.003619]  cv_wait_common+0x188/0x1b0 [spl]
  [25376.003654]  __cv_wait+0x30/0x40 [spl]
  [25376.003875]  zil_commit_impl+0x234/0xd30 [zfs]
  [25376.004092]  zil_commit+0x48/0x70 [zfs]
  [25376.004312]  zfs_write+0xa3c/0xb90 [zfs]
  [25376.004530]  zpl_write_common_iovec+0xac/0x120 [zfs]
  [25376.004748]  zpl_iter_write+0xe4/0x150 [zfs]
  [25376.004759]  new_sync_write+0x100/0x1a8
  [25376.004768]  __vfs_write+0x74/0x90
  [25376.004776]  vfs_write+0xe4/0x1c8
  [25376.004785]  ksys_write+0x78/0x100
  [25376.004794]  __arm64_sys_write+0x24/0x30
  [25376.004803]  el0_svc_common.constprop.0+0x80/0x218
  [25376.004811]  el0_svc_handler+0x34/0xa0
  [25376.004820]  el0_svc+0x10/0x2cc
  [25376.004917] INFO: task zbackup:95231 blocked for more than 120 seconds.
  [25376.011678]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25376.018587] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25376.026554] zbackup         D    0 95231   5499 0x00000000
  [25376.026562] Call trace:
  [25376.026575]  __switch_to+0x104/0x170
  [25376.026582]  __schedule+0x30c/0x7c0
  [25376.026588]  schedule+0x3c/0xb8
  [25376.026615]  cv_wait_common+0x188/0x1b0 [spl]
  [25376.026636]  __cv_wait+0x30/0x40 [spl]
  [25376.026771]  zil_commit_impl+0x234/0xd30 [zfs]
  [25376.026904]  zil_commit+0x48/0x70 [zfs]
  [25376.027042]  zfs_write+0xa3c/0xb90 [zfs]
  [25376.027174]  zpl_write_common_iovec+0xac/0x120 [zfs]
  [25376.027309]  zpl_iter_write+0xe4/0x150 [zfs]
  [25376.027317]  new_sync_write+0x100/0x1a8
  [25376.027322]  __vfs_write+0x74/0x90
  [25376.027332]  vfs_write+0xe4/0x1c8
  [25376.027338]  ksys_write+0x78/0x100
  [25376.027344]  __arm64_sys_write+0x24/0x30
  [25376.027350]  el0_svc_common.constprop.0+0x80/0x218
  [25376.027355]  el0_svc_handler+0x34/0xa0
  [25376.027361]  el0_svc+0x10/0x2cc
  [25376.027374] INFO: task z_wr_int:95546 blocked for more than 120 seconds.
  [25376.034203]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25376.041113] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25376.049080] z_wr_int        D    0 95546      2 0x00000028
  [25376.049089] Call trace:
  [25376.049100]  __switch_to+0x104/0x170
  [25376.049106]  __schedule+0x30c/0x7c0
  [25376.049111]  schedule+0x3c/0xb8
  [25376.049116]  io_schedule+0x20/0x58
  [25376.049123]  rq_qos_wait+0x100/0x178
  [25376.049129]  wbt_wait+0xb4/0xf0
  [25376.049135]  __rq_qos_throttle+0x38/0x50
  [25376.049143]  blk_mq_make_request+0x128/0x610
  [25376.049150]  generic_make_request+0xb4/0x2d8
  [25376.049156]  submit_bio+0x48/0x218
  [25376.049292]  vdev_disk_io_start+0x670/0x9f8 [zfs]
  [25376.049423]  zio_vdev_io_start+0xdc/0x2b8 [zfs]
  [25376.049554]  zio_nowait+0xd4/0x170 [zfs]
  [25376.049693]  vdev_queue_io_done+0x1ec/0x2a0 [zfs]
  [25376.049847]  zio_vdev_io_done+0xec/0x220 [zfs]
  [25376.049986]  zio_execute+0xac/0x110 [zfs]
  [25376.050010]  taskq_thread+0x2f8/0x570 [spl]
  [25376.050025]  kthread+0xfc/0x128
  [25376.050033]  ret_from_fork+0x10/0x1c
  [25376.050041] INFO: task z_wr_int:95553 blocked for more than 120 seconds.
  [25376.056873]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25376.063780] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25376.071746] z_wr_int        D    0 95553      2 0x00000028
  [25376.071754] Call trace:
  [25376.071764]  __switch_to+0x104/0x170
  [25376.071770]  __schedule+0x30c/0x7c0
  [25376.071775]  schedule+0x3c/0xb8
  [25376.071780]  io_schedule+0x20/0x58
  [25376.071787]  rq_qos_wait+0x100/0x178
  [25376.071792]  wbt_wait+0xb4/0xf0
  [25376.071798]  __rq_qos_throttle+0x38/0x50
  [25376.071806]  blk_mq_make_request+0x128/0x610
  [25376.071813]  generic_make_request+0xb4/0x2d8
  [25376.071819]  submit_bio+0x48/0x218
  [25376.071955]  vdev_disk_io_start+0x670/0x9f8 [zfs]
  [25376.072086]  zio_vdev_io_start+0xdc/0x2b8 [zfs]
  [25376.072217]  zio_nowait+0xd4/0x170 [zfs]
  [25376.072346]  vdev_queue_io_done+0x1ec/0x2a0 [zfs]
  [25376.072476]  zio_vdev_io_done+0xec/0x220 [zfs]
  [25376.072606]  zio_execute+0xac/0x110 [zfs]
  [25376.072629]  taskq_thread+0x2f8/0x570 [spl]
  [25376.072637]  kthread+0xfc/0x128
  [25376.072643]  ret_from_fork+0x10/0x1c
  [25496.722315] INFO: task z_wr_iss_h:2161 blocked for more than 241 seconds.
  [25496.729251]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25496.736195] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25496.744197] z_wr_iss_h      D    0  2161      2 0x00000028
  [25496.744247] Call trace:
  [25496.744268]  __switch_to+0x104/0x170
  [25496.744279]  __schedule+0x30c/0x7c0
  [25496.744288]  schedule+0x3c/0xb8
  [25496.744296]  io_schedule+0x20/0x58
  [25496.744308]  rq_qos_wait+0x100/0x178
  [25496.744317]  wbt_wait+0xb4/0xf0
  [25496.744327]  __rq_qos_throttle+0x38/0x50
  [25496.744340]  blk_mq_make_request+0x128/0x610
  [25496.744351]  generic_make_request+0xb4/0x2d8
  [25496.744362]  submit_bio+0x48/0x218
  [25496.744600]  vdev_disk_io_start+0x670/0x9f8 [zfs]
  [25496.744821]  zio_vdev_io_start+0xdc/0x2b8 [zfs]
  [25496.745039]  zio_nowait+0xd4/0x170 [zfs]
  [25496.745256]  vdev_mirror_io_start+0xa8/0x1b0 [zfs]
  [25496.745473]  zio_vdev_io_start+0x248/0x2b8 [zfs]
  [25496.745690]  zio_execute+0xac/0x110 [zfs]
  [25496.745730]  taskq_thread+0x2f8/0x570 [spl]
  [25496.745742]  kthread+0xfc/0x128
  [25496.745753]  ret_from_fork+0x10/0x1c
  [25496.745853] INFO: task zbackup:75339 blocked for more than 241 seconds.
  [25496.752621]       Tainted: P         C OE     5.4.0-1018-raspi #20-Ubuntu
  [25496.759557] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
  [25496.767547] zbackup         D    0 75339   5499 0x00000000
  [25496.767561] Call trace:
  [25496.767585]  __switch_to+0x104/0x170
  [25496.767595]  __schedule+0x30c/0x7c0
  [25496.767606]  schedule+0x3c/0xb8
  [25496.767646]  cv_wait_common+0x188/0x1b0 [spl]
  [25496.767682]  __cv_wait+0x30/0x40 [spl]
  [25496.767906]  zil_commit_impl+0x234/0xd30 [zfs]
  [25496.768126]  zil_commit+0x48/0x70 [zfs]
  [25496.768355]  zfs_create+0x544/0x7d0 [zfs]
  [25496.768574]  zpl_create+0xb8/0x178 [zfs]
  [25496.768595]  lookup_open+0x4ec/0x6a8
  [25496.768604]  do_last+0x260/0x8c0
  [25496.768614]  path_openat+0x84/0x258
  [25496.768624]  do_filp_open+0x84/0x108
  [25496.768636]  do_sys_open+0x180/0x2b0
  [25496.768647]  __arm64_sys_openat+0x2c/0x38
  [25496.768657]  el0_svc_common.constprop.0+0x80/0x218
  [25496.768665]  el0_svc_handler+0x34/0xa0
  [25496.768674]  el0_svc+0x10/0x2cc

  tyson@ubuntu:~$ lsb_release -rd
  Description:    Ubuntu 20.04.1 LTS
  Release:        20.04

  tyson@ubuntu:~$ apt-cache policy zfs-dkms
  zfs-dkms:
    Installed: 0.8.3-1ubuntu12.4
    Candidate: 0.8.3-1ubuntu12.4
    Version table:
   *** 0.8.3-1ubuntu12.4 500
          500 http://ports.ubuntu.com/ubuntu-ports focal-updates/universe arm64 
Packages
          500 http://ports.ubuntu.com/ubuntu-ports focal-updates/universe armhf 
Packages
          100 /var/lib/dpkg/status
       0.8.3-1ubuntu12 500
          500 http://ports.ubuntu.com/ubuntu-ports focal/universe arm64 Packages
          500 http://ports.ubuntu.com/ubuntu-ports focal/universe armhf Packages

  ProblemType: Bug
  DistroRelease: Ubuntu 20.04
  Package: zfs-dkms 0.8.3-1ubuntu12.4
  ProcVersionSignature: Ubuntu 5.4.0-1018.20-raspi 5.4.55
  Uname: Linux 5.4.0-1018-raspi aarch64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
  ApportVersion: 2.20.11-0ubuntu27.8
  Architecture: arm64
  CasperMD5CheckResult: skip
  Date: Sat Oct 10 08:50:34 2020
  ImageMediaBuild: 20200423.1
  PackageArchitecture: all
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=C.UTF-8
   SHELL=/bin/bash
  SourcePackage: zfs-linux
  UpgradeStatus: No upgrade log present (probably fresh install)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/zfs-linux/+bug/1899249/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp

Reply via email to