> On Jul 9, 2018, at 2:39 PM, Ken Merry <k...@freebsd.org> wrote:
>
> Hi ZFS folks,
>
> We (Spectra Logic) have seen some odd behavior with resilvers in RAIDZ3 pools.
>
> The codebase in question is FreeBSD stable/11 from July 2017, at
> approximately FreeBSD SVN version 321310.
>
> We have customer systems with (sometimes) hundreds of SMR drives in RAIDZ3
> vdevs in a large pool. (A typical arrangement is a 23-drive RAIDZ3, and some
> customers will put everything in one giant pool made up of a number of
> 23-drive RAIDZ3 arrays.)
>
> The SMR drives in question have a bug that sometimes causes them to go off
> the SAS bus for up to two minutes. (They’re usually gone a lot less than
> that, up to 10 seconds.) Once they come back online, zfsd puts the drive
> back in the pool and makes it online.
ouch
>
> If a resilver is active on a different drive, once the drive that temporarily
> left comes back, the resilver apparently starts over from the beginning.
>
> This leads to resilvers that take forever to complete, especially on systems
> with high load.
>
> Is this expected behavior?
Scans and resilvers run at the DSL layer. The scan thread walks each dataset,
starting at the txg needed (a full scan starts at txg = effectively 0).
>
> It seems that only one scan can be active on a pool at any given time. Is
> that correct? If so, is that true for an entire top level pool, or just a
> given redundancy group? (In this case, it would be the RAIDZ3 vdev.)
There is one scan thread.
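To make that concrete, here is a toy model, not ZFS code. Every name in it
(Pool, device_returned, step, run) is made up for illustration. It assumes,
in line with what you observed, that a returning device forces the single
scan to restart from the oldest txg that any device still needs.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Pool:
    """Toy pool with a single scan; all names here are hypothetical."""
    max_txg: int                       # newest txg the scan has to cover
    scan_pos: Optional[int] = None     # next txg to visit, None when idle
    restarts: int = 0                  # times an active scan was reset
    missing: Dict[str, int] = field(default_factory=dict)  # dev -> oldest txg it lacks

    def device_returned(self, name: str, oldest_missing_txg: int) -> None:
        # Assumption: one scan serves every device, so it must (re)start
        # at the smallest txg that any device is still missing.
        self.missing[name] = oldest_missing_txg
        if self.scan_pos is not None:
            self.restarts += 1
        self.scan_pos = min(self.missing.values())

    def step(self, txgs: int) -> None:
        # Advance the scan; once it reaches max_txg every device is current.
        if self.scan_pos is None:
            return
        self.scan_pos = min(self.scan_pos + txgs, self.max_txg)
        if self.scan_pos >= self.max_txg:
            self.missing.clear()
            self.scan_pos = None


def run(pool: Pool, txgs_per_step: int, flap_every: int, max_steps: int) -> Optional[int]:
    """Return the step at which the resilver finishes, or None if it never does."""
    pool.device_returned("replacement", 0)   # new drive needs everything
    for n in range(1, max_steps + 1):
        pool.step(txgs_per_step)
        if pool.scan_pos is None:
            return n
        if flap_every and n % flap_every == 0:
            # The flaky drive only missed recent txgs, but the replacement
            # still needs txg 0, so the shared scan goes back to 0.
            pool.device_returned("flaky", pool.max_txg - 1)
    return None


if __name__ == "__main__":
    print("no flaps:", run(Pool(max_txg=1000), 10, 0, 1000))
    flaky = Pool(max_txg=1000)
    print("flap every 50 steps:", run(flaky, 10, 50, 1000), "restarts:", flaky.restarts)
```

In this model, a drive that drops out more often than one full pass takes
keeps the resilver from ever finishing, however little data that drive
actually missed.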
>
> Is there anything we can do to make sure the resilvers complete in a
> reasonable period of time or otherwise improve the behavior? (Short of
> putting in different drives…I have already suggested that.)
There are ways to change or tune the ZIO scheduler, but that won't make SMR
drives any faster.
-- richard
>
> Thanks,
>
> Ken
> —
> Ken Merry
> k...@freebsd.org
>
------------------------------------------
openzfs: openzfs-developer
Permalink:
https://openzfs.topicbox.com/groups/developer/T2a7340f4c0c48fa9-M1557c2e89aed98caf806a17a