Posted to the wrong list. Adding to illumos-discuss.

On Mon, Nov 16, 2020 at 8:22 AM Schweiss, Chip <[email protected]> wrote:
> I have a distressed pool that I cannot get a faulted disk to replace.
>
> This is the pool as I discovered it this morning:
>
>   pool: drpool04
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Sat Nov 14 11:35:02 2020
>         140T scanned out of 388T at 952M/s, 76h6m to go
>         3.20T resilvered, 35.96% done
> config:
>
> NAME                           STATE     READ WRITE CKSUM
> drpool04                       DEGRADED     0     0     0
>   raidz2-0                     ONLINE       0     0     0
>     c0t5000C500A7174C1Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 2
>     c0t5000C500A7166DBBd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 3
>     c0t5000C500A715FB3Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 4
>     c0t5000C500A71BA81Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 5
>     c0t5000C500A716D993d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 6
>     c0t5000C500A717C8BFd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 7
>     c0t5000C500A709DDFBd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 8
>     c0t5000C500A71976E7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 9
>     c0t5000C500A716CB5Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 10
>     c0t5000C500A7193247d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 11
>     c0t5000C500A7196AF3d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 12
>     c0t5000C500A716578Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 13
>     c0t5000C500A7174387d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 14
>     c0t5000C500A71CFB77d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 15
>   raidz2-1                     ONLINE       0     0     0
>     c0t5000C500A717F117d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 16
>     c0t5000C500A7161D3Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 17
>     c0t5000C500A687FDA7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 18
>     c0t5000C500A6865B7Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 19
>     c0t5000C500A7192253d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 20
>     c0t5000C500A719770Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 21
>     c0t5000C500A714668Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 22
>     c0t5000C500A71E23F7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 23
>     c0t5000C500A715EF3Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 24
>     c0t5000C500A71BAE9Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 25
>     c0t5000C500A717BD6Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 26
>     c0t5000C500A71B91E7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 27
>     c0t5000C500A716881Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 28
>     c0t5000C500A6865B17d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 29
>   raidz2-2                     DEGRADED     0     0     0
>     c0t5000C500A716FA4Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 30
>     c0t5000C500A688424Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 31
>     c0t5000C500A716DA2Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 32
>     spare-3                    FAULTED      0     0     0
>       c0t5000C500A7192247d0    FAULTED      0     0     0  external device fault
>       c0t5000C500A71CEEFFd0    ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 44
>     c0t5000C500A71919F7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 34
>     c0t5000C500A717C6B7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 35
>     c0t5000C500A716CE77d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 36
>     c0t5000C500A6CA7707d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 37
>     c0t5000C500A71BB05Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 38
>     spare-9                    UNAVAIL      0     0     0  insufficient replicas
>       5305257770530967646      UNAVAIL      0     0     0  was /dev/dsk/c0t5000C500A6F8BDDFd0s0
>       1685164993699279407      FAULTED      0     0     0  was /dev/dsk/c0t5000C500A71CEEFFd0s0
>     c0t5000C500A71DD4A7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 40
>     c0t5000C500A718431Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 41
>     c0t5000C500A6866C13d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 42
>     c0t5000C500A7148C97d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 43
>   spares
>     c0t5000C500A71CEEFFd0      INUSE     currently in use  | MIR-DR-RAID90-1_25717 | 44
>     c0t5000C500A716C0F3d0      AVAIL     | MIR-DR-RAID90-1_25717 | 45
>
> errors: No known data errors
>
> I then detached the unavail disk followed by a replace, which failed. I
> then tried adding another spare to get the replace going. It failed the
> same way, until I replaced it while not attached to the pool. However, it
> attached to the wrong disk!
>
> # zpool detach drpool04 5305257770530967646
> # ozmt-zpool-status.sh drpool04
>   pool: drpool04
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Sat Nov 14 11:35:02 2020
>         142T scanned out of 388T at 961M/s, 74h36m to go
>         3.26T resilvered, 36.63% done
> config:
>
> NAME                           STATE     READ WRITE CKSUM
> drpool04                       DEGRADED     0     0     0
>   raidz2-0                     ONLINE       0     0     0
>     c0t5000C500A7174C1Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 2
>     c0t5000C500A7166DBBd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 3
>     c0t5000C500A715FB3Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 4
>     c0t5000C500A71BA81Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 5
>     c0t5000C500A716D993d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 6
>     c0t5000C500A717C8BFd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 7
>     c0t5000C500A709DDFBd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 8
>     c0t5000C500A71976E7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 9
>     c0t5000C500A716CB5Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 10
>     c0t5000C500A7193247d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 11
>     c0t5000C500A7196AF3d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 12
>     c0t5000C500A716578Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 13
>     c0t5000C500A7174387d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 14
>     c0t5000C500A71CFB77d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 15
>   raidz2-1                     ONLINE       0     0     0
>     c0t5000C500A717F117d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 16
>     c0t5000C500A7161D3Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 17
>     c0t5000C500A687FDA7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 18
>     c0t5000C500A6865B7Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 19
>     c0t5000C500A7192253d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 20
>     c0t5000C500A719770Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 21
>     c0t5000C500A714668Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 22
>     c0t5000C500A71E23F7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 23
>     c0t5000C500A715EF3Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 24
>     c0t5000C500A71BAE9Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 25
>     c0t5000C500A717BD6Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 26
>     c0t5000C500A71B91E7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 27
>     c0t5000C500A716881Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 28
>     c0t5000C500A6865B17d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 29
>   raidz2-2                     DEGRADED     0     0     0
>     c0t5000C500A716FA4Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 30
>     c0t5000C500A688424Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 31
>     c0t5000C500A716DA2Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 32
>     spare-3                    DEGRADED     0     0     0
>       c0t5000C500A7192247d0    FAULTED      0     0     0  external device fault
>       c0t5000C500A71CEEFFd0    ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 44
>     c0t5000C500A71919F7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 34
>     c0t5000C500A717C6B7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 35
>     c0t5000C500A716CE77d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 36
>     c0t5000C500A6CA7707d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 37
>     c0t5000C500A71BB05Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 38
>     1685164993699279407        FAULTED      0     0     0  was /dev/dsk/c0t5000C500A71CEEFFd0s0
>     c0t5000C500A71DD4A7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 40
>     c0t5000C500A718431Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 41
>     c0t5000C500A6866C13d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 42
>     c0t5000C500A7148C97d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 43
>   spares
>     c0t5000C500A716C0F3d0      AVAIL     | MIR-DR-RAID90-1_25717 | 45
>
> errors: No known data errors
>
> # zpool replace drpool04 1685164993699279407 c0t5000C500A716C0F3d0
> cannot replace 1685164993699279407 with c0t5000C500A716C0F3d0: already in
> replacing/spare config; wait for completion or use 'zpool detach'
>
> # zpool attach drpool04 1685164993699279407 c0t5000C500A716C0F3d0
> cannot attach c0t5000C500A716C0F3d0 to 1685164993699279407:
> c0t5000C500A716C0F3d0 is busy, or device removal is in progress
>
> # zpool add drpool04 spare c0t5000C500AE8296AFd0
> # zpool replace drpool04 1685164993699279407 c0t5000C500AE8296AFd0
> cannot replace 1685164993699279407 with c0t5000C500AE8296AFd0: already in
> replacing/spare config; wait for completion or use 'zpool detach'
>
> # zpool attach drpool04 1685164993699279407 c0t5000C500AE8296AFd0
> invalid vdev specification
> use '-f' to override the following errors:
> /dev/dsk/c0t5000C500AE8296AFd0s0 is reserved as a hot spare for ZFS pool
> drpool04.  Please see zpool(1M).
>
> # zpool remove drpool04 c0t5000C500AE8296AFd0
>
> *# zpool replace drpool04 1685164993699279407 c0t5000C500AE8296AFd0*
>
> Now it is attached to the wrong disk:
>
> # ozmt-zpool-status.sh drpool04
>   pool: drpool04
>  state: DEGRADED
> status: One or more devices is currently being resilvered.  The pool will
>         continue to function, possibly in a degraded state.
> action: Wait for the resilver to complete.
>   scan: resilver in progress since Mon Nov 16 07:55:45 2020
>         2.08G scanned out of 389T at 142M/s, (scan is slow, no estimated time)
>         86.7M resilvered, 0.00% done
> config:
>
> NAME                           STATE     READ WRITE CKSUM
> drpool04                       DEGRADED     0     0     0
>   raidz2-0                     ONLINE       0     0     0
>     c0t5000C500A7174C1Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 2
>     c0t5000C500A7166DBBd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 3
>     c0t5000C500A715FB3Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 4
>     c0t5000C500A71BA81Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 5
>     c0t5000C500A716D993d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 6
>     c0t5000C500A717C8BFd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 7
>     c0t5000C500A709DDFBd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 8
>     c0t5000C500A71976E7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 9
>     c0t5000C500A716CB5Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 10
>     c0t5000C500A7193247d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 11
>     c0t5000C500A7196AF3d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 12
>     c0t5000C500A716578Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 13
>     c0t5000C500A7174387d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 14
>     c0t5000C500A71CFB77d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 15
>   raidz2-1                     ONLINE       0     0     0
>     c0t5000C500A717F117d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 16
>     c0t5000C500A7161D3Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 17
>     c0t5000C500A687FDA7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 18
>     c0t5000C500A6865B7Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 19
>     c0t5000C500A7192253d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 20
>     c0t5000C500A719770Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 21
>     c0t5000C500A714668Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 22
>     c0t5000C500A71E23F7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 23
>     c0t5000C500A715EF3Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 24
>     c0t5000C500A71BAE9Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 25
>     c0t5000C500A717BD6Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 26
>     c0t5000C500A71B91E7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 27
>     c0t5000C500A716881Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 28
>     c0t5000C500A6865B17d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 29
>   raidz2-2                     DEGRADED     0     0     0
>     c0t5000C500A716FA4Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 30
>     c0t5000C500A688424Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 31
>     c0t5000C500A716DA2Fd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 32
>     spare-3                    DEGRADED     0     0     0
>       c0t5000C500A7192247d0    FAULTED      0     0     0  external device fault
>       replacing-1              ONLINE       0     0     0
>         c0t5000C500A71CEEFFd0  ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 44
>         c0t5000C500AE8296AFd0  ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 90
>     c0t5000C500A71919F7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 34
>     c0t5000C500A717C6B7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 35
>     c0t5000C500A716CE77d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 36
>     c0t5000C500A6CA7707d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 37
>     c0t5000C500A71BB05Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 38
>     1685164993699279407        FAULTED      0     0     0  was /dev/dsk/c0t5000C500A71CEEFFd0s0
>     c0t5000C500A71DD4A7d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 40
>     c0t5000C500A718431Bd0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 41
>     c0t5000C500A6866C13d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 42
>     c0t5000C500A7148C97d0      ONLINE       0     0     0  | MIR-DR-RAID90-1_25717 | 43
>   spares
>     c0t5000C500A716C0F3d0      AVAIL     | MIR-DR-RAID90-1_25717 | 45
>
> errors: No known data errors
>
> Notice the corresponding device IDs. It's almost like a spare was used
> twice.
>
> Any suggestions on how to fix this?
>
> -Chip

------------------------------------------
illumos: illumos-discuss
Permalink: https://illumos.topicbox.com/groups/discuss/Tc5bd41589c784fed-M5e57fe0955be25e4a67cc611
Delivery options: https://illumos.topicbox.com/groups/discuss/subscription
