<<On Wed, 14 May 2025 21:04:20 -0400, Charles Sprickman <[email protected]> said:

> I'm curious if you've tried hot swap since then and what happened?
> And also if you found your original discussion on this anywhere. :)

Negative to both.  Most of our systems are either cheesy little
network servers that don't have NVMe, or big NFS servers that have big
JBODs full of SAS HDDs.  The NVMe servers were an experiment, and with
costs as they are now I believe it's a better choice to just stuff the
server full of RAM than to put terabytes of NVMe SSD in it.  (Our
regular file servers still have NVMe drives for cache and log, but
they're internal, not hot-swap.)

> Can anyone else comment on generally how swapping a drive on a ZFS
> system might differ from the "old" SATA/SAS/SCSI way?

On a normal ZFS pool with SATA or SAS drives, you could theoretically
just pull the drive: the HBA would notice, tell GEOM to wither the
provider, which would send a notification up to ZFS saying "not here
any more", and ZFS would record the device as MISSING.  (If you were
using zfsd it would automatically replace it with a spare.)  If you
really wanted to play it safe, you could `zpool offline` or `zpool
detach` the bad drive, then `camcontrol stop` it before removing it.
(Of course in a 60-drive JBOD you probably also want to `sesutil
locate` the faulty drive so you can figure out which one it is when
you're looking inside the enclosure.)
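
For concreteness, the safe-removal sequence looks roughly like this
(pool and device names are made up; "da42" stands in for whichever
disk is failing):

    # light the slot LED so you can find the drive in the enclosure
    sesutil locate da42 on
    # take the drive out of service in ZFS
    zpool offline tank da42
    # quiesce/spin down the drive before physically pulling it
    camcontrol stop da42
    # (physically swap the drive, then resilver onto the new one)
    zpool replace tank da42
    sesutil locate da42 off

If you run zfsd (zfsd_enable="YES" in rc.conf) with a spare
configured, the offline-and-replace part happens by itself.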

With the old (non-CAM-aware) nvd(4) driver, none of this worked.  With
the new CAMified nda(4) driver, there is at least a mechanism to send
notifications up the stack when a drive goes away -- but you still
need the bottom-half code that gets signals from the PCIe bridges that
a device has gone away and that a new device has arrived.  When I
tried this last summer, the kernel didn't even notice that a drive had
been removed, nor did it do proper device enumeration when the same
drive was reinserted in the same slot.
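
(For anyone who wants to experiment: I believe the relevant knobs are
a loader tunable to get the CAM-attached driver, on releases where
nvd(4) is still the default, plus the PCIe hot-plug option in the
kernel configuration:

    # /boot/loader.conf: attach NVMe namespaces via nda(4), not nvd(4)
    hw.nvme.use_nvd="0"

    # kernel config file: PCIe native hot-plug support
    options PCI_HP

but as I say below, having those alone is evidently not enough.)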

My understanding from some of the discussion last summer was that even
with nda(4) and `options PCI_HP` there were additional administrative
steps required to safely remove and install NVMe disks online.
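
My guess is that those steps amount to detaching the device by hand
before pulling it and rescanning the parent bus after inserting the
replacement, along the lines of the following devctl(8) invocations
(hypothetical device names, and untested speculation on my part):

    # before pulling the drive: detach the controller from its driver
    devctl detach nvme0
    # after inserting the new drive: rescan the bus behind that slot
    devctl rescan pci5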

(The purpose of this exercise was to expand a 32-drive array by
replacing one drive in each vdev seriatim until all vdevs had been
upgraded.  Instead I had to copy the data off to another server,
destroy the pool, replace all the drives, create a new pool, and copy
the data back -- not at all what I had promised the user we could do.)
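
For the record, what I had intended was the usual replace-and-resilver
dance, one drive at a time (pool and device names hypothetical):

    # for each drive in each vdev, seriatim:
    zpool replace tank nda12        # after physically swapping it
    zpool wait -t resilver tank     # wait for the resilver to finish
    # once every drive in a vdev has been replaced, grow the vdev:
    zpool online -e tank nda12      # or set autoexpand=on beforehand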

-GAWollman

