Hi Andy,
answer & pointer below...

Andrew Hisgen wrote:
> Question embedded below...
>
> Richard Elling wrote:
> ...
>> If you surf to http://www.sun.com/msg/ZFS-8000-HC you'll
>> see words to the effect that,
>> The pool has experienced I/O failures. Since the ZFS pool property
>> 'failmode' is set to 'wait', all I/Os (reads and writes) are
>> blocked. See the zpool(1M) manpage for more information on the
>> 'failmode' property. Manual intervention is required for I/Os to
>> be serviced.
>>
>>>
>>> I would guess that ZFS is attempting to write to the disk in the
>>> background, and that this is silently failing.
>>
>> It is clearly not silently failing.
>>
>> However, the default failmode property is set to "wait" which will
>> patiently wait forever. If you would rather have the I/O fail, then
>> you should change the failmode to "continue" I would not normally
>> recommend a failmode of "panic"
>
> Hi Richard,
>
> Does failmode==wait cause ZFS itself to retry i/o, that is, to retry an
> i/o where an earlier request (of that same i/o) returned from the driver
> with an error? If so, that will compound timeouts even further.
>
> I'm also confused by your statement that wait means wait forever, given
> that the actual circumstances here are that zfs (and the rest of the
> i/o stack) returned after 9 minutes.

The details are in PSARC/2007/567, which is externally available at:
http://www.opensolaris.org/os/community/arc/caselog/2007/567/

With failmode=wait, I/Os will wait until "manual intervention", which
the case shows as an administrator running zpool clear on the affected
pool.

I see the need for a document to help people work through these cases,
as they can be complex at many different levels.
 -- richard
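
For anyone following along, the commands involved look roughly like the sketch below. The pool name "tank" is a placeholder, not something from this thread, and the `run` wrapper is only a hypothetical dry-run helper: it prints each command unless RUN=1 is set, so the script is safe to try on a machine without the affected pool.

```shell
#!/bin/sh
# Placeholder pool name (assumption); substitute the affected pool.
POOL=${POOL:-tank}

# Hypothetical dry-run helper: print the command unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Show the current failmode setting (the default is "wait").
run zpool get failmode "$POOL"

# Check pool health and any reported I/O failures.
run zpool status -x "$POOL"

# Optional: have I/O fail rather than block until an admin steps in.
run zpool set failmode=continue "$POOL"

# The "manual intervention" step: clear the errors once the underlying
# device problem has been dealt with, so blocked I/O can proceed.
run zpool clear "$POOL"
```

Run it as-is to see the commands, or with RUN=1 (and the right POOL) to execute them. Whether to change failmode at all is a judgment call for the administrator; see zpool(1M).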