On Thu, Nov 6, 2008 at 1:50 AM, Eric Sproul <esproul at omniti.com> wrote:
> Ben Rockwood wrote:
>> This is troubling because it means one disk can go wonky and render your
>> storage system useless until someone can respond, and I'd imagine most
>> admins would "solve" the problem via reboot, a very poor solution.
>
> At the risk of being just AOL-style "me too" noise... this has bitten us too,
> but in our case it seemed to be poor hardware/drivers. The system had the same
> pathology you describe, but the errors were retryable writes to a disk that had
> failed. The controller (Adaptec SATA RAID, aac) would not fail the device,
> instead it just kept issuing retryables and the effect was the same. The only

One would expect the file system to time out and subsequently disconnect or
fail the respective drive. Peculiarly, ZFS, with its self-healing, doesn't.

Regards,
Andrey

> way I could recover was to shut down and physically detach the offending disk.
>
> My disks show up as SCSI in this scenario, and IIRC (memory is hazy from the
> stress) cfgadm commands were failing as well.
>
> Eric
> _______________________________________________
> storage-discuss mailing list
> storage-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/storage-discuss