[zfs-code] [storage-discuss] Hung Pools

Andrey Kuzmin Thu, 6 Nov 2008 12:52:02 +0300

On Thu, Nov 6, 2008 at 1:50 AM, Eric Sproul <esproul at omniti.com> wrote:
> Ben Rockwood wrote:
>> This is troubling because it means one disk can go wonky and render your
>> storage system useless until someone can respond, and I'd imagine most
>> admins would "solve" the problem via reboot, a very poor solution.
>
> At the risk of being just AOL-style "me too" noise... this has bitten us too,
> but in our case it seemed to be poor hardware/drivers.  The system had the same
> pathology you describe, but the errors were retryable writes to a disk that had
> failed.  The controller (Adaptec SATA RAID, aac) would not fail the device,
> instead it just kept issuing retryables and the effect was the same.  The only


One would expect the file system to time out and subsequently
disconnect/fail the respective drive. It is peculiar that ZFS, with
its self-healing, doesn't.
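
To make that expectation concrete, here is a minimal sketch in Python of
a timeout-and-fail policy. It is not how ZFS or the aac driver behaves;
every name in it (Device, DeviceFailed, issue_write, max_retries,
deadline_s) is hypothetical and made up for illustration only.

```python
import time


class DeviceFailed(Exception):
    """Raised when the policy gives up on a device and marks it failed."""


class Device:
    # Hypothetical stand-in for a disk. `write_fn` returns True on success
    # and False on a retryable error.
    def __init__(self, name, write_fn):
        self.name = name
        self.write_fn = write_fn
        self.failed = False


def issue_write(dev, data, max_retries=3, deadline_s=1.0, clock=time.monotonic):
    """Try a write, retrying within a bounded budget, then fail the device.

    Returns the number of attempts used on success. Raises DeviceFailed
    once the retry count or the time budget is exhausted.
    """
    if dev.failed:
        # A device already marked failed is never retried again.
        raise DeviceFailed(dev.name)
    start = clock()
    attempts = 0
    # One initial attempt plus at most `max_retries` retries, all of which
    # must start before the deadline passes.
    while attempts <= max_retries and clock() - start < deadline_s:
        attempts += 1
        if dev.write_fn(data):
            return attempts
    # Budget exhausted: stop retrying and take the device out of service.
    dev.failed = True
    raise DeviceFailed(dev.name)
```

The point of the sketch is only that retries are bounded: once the
budget runs out the device is marked failed, so in principle a redundant
pool could continue degraded rather than hang on endless retryables.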

Regards,
Andrey

> way I could recover was to shut down and physically detach the offending disk.
>
> My disks show up as SCSI in this scenario, and IIRC (memory is hazy from the
> stress) cfgadm commands were failing as well.
>
> Eric
> _______________________________________________
> storage-discuss mailing list
> storage-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/storage-discuss
>
