On 22/10/2015 12:07, Gabriele Bulfon wrote:
Yes, I understand, but why is the zpool not signaling the problem?
I only noticed the problem while doing a cp, and then through iostat.
When does ZFS take account of the problem and either put the disk offline or
send a notification through fault management?
----------------------------------------------------------------------------------
From: Udo Grabowski
 [email protected]
 Gabriele Bulfon
Date: 22 October 2015 11:41:37 CEST
Subject: Re: [discuss] iostat errors on zpool mirror
On 22/10/2015 11:27, Gabriele Bulfon wrote:
Hi,
I have a failing SATA device inside a mirrored zpool.
Shouldn't I see failures on the zpool device too?
Any big file copy onto the mirror makes the system VERY slow, and I have
to kill the cp.
How can I dig deeper into the problem?
sonicle@xstreamserver:~# iostat -e
---- errors ---
device  s/w h/w trn tot
sd0       0   5   0   5
sd1       0   0   0   0
sd2       0   0   0   0
sd3       0   0   0   0
sd4       0   0   0   0
sd5       0   0   0   0
sd6       0   0   0   0
sd7       0   0   0   0
sd13      0  33   0  33
nfs1      0   0   0   0
sonicle@xstreamserver:~# kstat -n sd13,err
module: sderr                           instance: 13
name:   sd13,err                        class:    device_error
Hard Errors                     33
Illegal Request                 223
Media Error                     24
No need to dig further: that disk is going south, get a new
one. If you trust your other disk, try 'zpool offline' on the
failing one and take a backup quickly before replacing it.
You can 'set sd:sd_io_time = 0x20' in /etc/system (or via mdb
for immediate effect) to shorten the outages, but that disk
will plague you until it finally dies (which can
take several weeks; we had this pest a couple of times this year).
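
As a rough sketch of the steps above: the pool name `tank` and the device
name `c2t13d0` are placeholders, not taken from this thread, so use the names
that `zpool status` reports. 0x20 is 32, and sd_io_time is in seconds. The
mdb lines are the usual way to read and patch a kernel variable and are an
assumption about how "via mdb" would be done here.

```shell
# Placeholders: replace with the pool and device shown by `zpool status`.
POOL=tank
DISK=c2t13d0

# Take the failing half of the mirror out of service.
zpool offline "$POOL" "$DISK"

# Persistent setting: add this line to /etc/system (read at boot).
#   set sd:sd_io_time = 0x20

# Running kernel: show the current value, then write the new one.
echo 'sd_io_time/X' | mdb -k
echo 'sd_io_time/W 0x20' | mdb -kw
```

All of these need root, and the mdb write changes a live kernel variable, so
try it on a test box first if you can.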
--
Dr.Udo Grabowski   Inst.f.Meteorology & Climate Research IMK-ASF-SAT
http://www.imk-asf.kit.edu/english/sat.php
KIT - Karlsruhe Institute of Technology           http://www.kit.edu
Postfach 3640,76021 Karlsruhe,Germany T:(+49)721 608-26026 F:-926026

Yes, I understand, but why is the zpool not signaling the problem?
I only noticed the problem while doing a cp, and then through iostat.
When does ZFS take account of the problem and either put the disk offline or
send a notification through fault management?

ZFS signals it once the disk is faulted, but first it retries the disk
a couple of times, each try bounded by the sd timeout, and when the disk
comes back, ZFS is happy again. Since it tries to maintain the
integrity of the mirror, it will always
wait to synchronize the second disk. Disk failures like these are
extremely annoying because the disk often manages to get around a
problem within this timeout. So lower the sd timeout to
get more failures counted by fm, so that the disk is taken
out earlier due to excessive errors. We sometimes go down to
5 seconds when we cannot get the disk out by other means, and,
if nothing helps, we simply pull the disk to force a fault.
Self-healing capabilities often have a price tag, too...
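
Until the disk is actually faulted, the sd error counters are the only
visible sign, so it can help to watch them yourself. A minimal sketch,
assuming the five-column `iostat -e` layout quoted earlier in this thread
(device, s/w, h/w, trn, tot); the function name `failing_devices` is made up
for this example:

```shell
#!/bin/sh
# Read `iostat -e` output on stdin and print every device whose total
# error count (5th column) is non-zero. Header lines are skipped because
# their 5th field is missing or not a number.
failing_devices() {
    awk 'NF == 5 && $5 ~ /^[0-9]+$/ && $5 > 0 {
        print $1, "hard=" $3, "total=" $5
    }'
}

# Typical use on the live system (not executed here):
#   iostat -e | failing_devices
```

Fed the output quoted above, this prints sd0 (hard=5 total=5) and sd13
(hard=33 total=33). Run from cron, it would give a warning well before fm
decides to take the disk out.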

