Vitaliy,

Thanks for the details.  I wasn't aware of the 'freeing' property.  That's
very useful for seeing progress.

There's plenty of space on the pool, both now and when the delete started.
No checkpoint, no dedup.  This is a raidz3 pool of 90 12TB disks.

I've been bumping zfs_free_min_time_ms, but it has had only minor
influence.  It is currently set to 100000.  Should I keep bumping it by
orders of magnitude?  I'd rather hobble the pool temporarily to work
through this crippling problem.
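
For reference, this is roughly how I've been bumping it (a sketch; the mdb
write takes effect immediately, and the /etc/system line persists it
across reboots):

# echo 'zfs_free_min_time_ms/W 0t100000' | mdb -kw

and in /etc/system:

set zfs:zfs_free_min_time_ms = 100000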

# zpool list hcpdr03
NAME      SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
hcpdr03  1.02P   631T   416T        -         -    10%    60%  1.00x  DEGRADED  -

# zpool get freeing hcpdr03
NAME     PROPERTY  VALUE    SOURCE
hcpdr03  freeing   125T     default
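
To confirm the freeing number is actually shrinking, a crude loop like
this (interval arbitrary) works for watching it:

# while true; do date; zpool get freeing hcpdr03; sleep 300; done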

# zpool status hcpdr03|head
  pool: hcpdr03
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Jun  1 21:52:38 2020
        1 scanned out of 639T at 1/s, (scan is slow, no estimated time)
    0 resilvered, 0.00% done

It dropped a disk about two weeks ago, and resilver progress is almost
non-existent.  It was rebooted yesterday; it was about 5% complete before
the reboot.  Previously, this pool would resilver in 5-7 days.

I/O is relatively low for this pool:
# zpool iostat hcpdr03
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
hcpdr03      631T   416T    118    553   507K  10.2M
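
A per-vdev view (sampling every 5 seconds) might show whether one
top-level vdev is holding things up:

# zpool iostat -v hcpdr03 5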

# iostat -xn |head
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   24.6 8910.7  115.4 56272.5 10.9 13.2    1.2    1.5   9  40 hcpdr01
   28.9 1823.5  125.7 14127.5  3.0  2.1    1.6    1.1   4  34 hcpdr02
  160.1 2279.3  687.9 21067.8  8.7  8.3    3.5    3.4   3  22 hcpdr03

-Chip

On Mon, Jun 1, 2020 at 11:03 PM Vitaliy Gusev <[email protected]>
wrote:

> 1. Can you play with zfs_free_min_time_ms?  The default value is 1/5 of
> the txg sync time (zfs_txg_timeout).
>
> unsigned int zfs_free_min_time_ms = 1000; /* min millisecs to free per txg
> */
>
> Also, it could be that reading metadata for freeing is slow (due to ARC
> constraints, heavy I/O, or a fragmented pool on HDDs), which can have the
> side effect that metadata cannot be read quickly enough to be ready
> within zfs_txg_timeout seconds, so the blocks' freeing is postponed to
> the next spa-sync.  Look at dsl_scan_async_block_should_pause() for
> details.
>
> 2. Have you set a checkpoint on the pool?  It can break reclaiming if
> there is not enough space; look at spa_suspend_async_destroy() for more
> details.
>
> 3. Have you enabled dedup?  Data blocks can still be referenced in that
> case and will not be freed.
>
> BTW, does "zpool get freeing $pool" show 150TB?
>
> ———
> Vitaliy Gusev
>
>
>
> On 1 Jun 2020, at 21:34, Schweiss, Chip <[email protected]> wrote:
>
> These are ZFS filesystems that were destroyed, snapshots and all.
>
> zfs destroy -r {folder}
>
> It is not instant.  This too goes on the delete queue, which recycles
> blocks in the background.