> 
> zpool status -x output would be useful. These error reports do not
> include a pointer to the faulty device. fmadm can also give more info.
> 

Yes. Thanks.
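
For anyone following along, here is a minimal sketch of gathering the two reports suggested above in one pass. It assumes `zpool` and `fmadm` are on the PATH and that `fmadm faulty` is the subcommand wanted (fmadm normally needs root). The helper name `collect_diagnostics` and the `runner` parameter are hypothetical, added only for illustration.

```python
import subprocess

# Commands named in the thread. "faulty" is the fmadm subcommand that
# lists resources the fault manager currently considers faulty.
DIAG_COMMANDS = (
    ("zpool", "status", "-x"),
    ("fmadm", "faulty"),
)


def collect_diagnostics(commands=DIAG_COMMANDS, runner=subprocess.run):
    """Run each command and return {command string: captured output}.

    `runner` is a hypothetical injection point so the function can be
    exercised on a machine that has neither tool installed.
    """
    report = {}
    for argv in commands:
        key = " ".join(argv)
        try:
            result = runner(list(argv), capture_output=True, text=True, check=False)
            report[key] = (result.stdout or "") + (result.stderr or "")
        except OSError as exc:
            # Tool not installed, or not permitted to run it.
            report[key] = f"could not run: {exc}"
    return report
```

Calling `collect_diagnostics()` and printing each entry gives a single blob of text that can be pasted into a reply.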

> mpathadm can be used to determine the device paths for this disk.
> 
> Notice how the disk is offline at multiple times. There is some sort of
> recovery going on here that continues to fail later. I call these
> "wounded soldiers" because they take a lot more care than a dead
> soldier. You would be better off if the drive completely died.
> 

I think mpathadm only works with mpts2 (sas2), where multipathing is
forcibly enabled. I agree the disk was in a sort of "critical status" before
it died. The difficulty is that the OS can NOT automatically offline the
wounded disk in the middle of the night (which may be what causes the
ensuing SCSI reset storm), and nobody is around to do it by hand.
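
As a rough illustration of what spotting a "wounded" disk could look like, the sketch below counts offline events per device in a sliding time window and flags a device once it has flapped too often. This is not something the OS or ZFS does for you. The class name `FlapDetector`, the thresholds, and the device names are all hypothetical, and how the offline events are obtained is left out entirely.

```python
from collections import deque


class FlapDetector:
    """Flag a device that goes offline too many times within a time window."""

    def __init__(self, max_events=3, window_seconds=3600.0):
        # Both thresholds are illustrative guesses, not recommended values.
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._events = {}  # device name -> deque of offline timestamps

    def record_offline(self, device, timestamp):
        """Record one offline event; return True if the device looks wounded."""
        times = self._events.setdefault(device, deque())
        times.append(timestamp)
        # Forget events that have aged out of the window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        return len(times) >= self.max_events
```

When `record_offline` returns True, an operator (or a carefully tested script) could decide to take the disk out of service, for example with `zpool offline`, rather than letting it keep attempting recovery.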

> 
> In my experience they start randomly and in some cases are not
> reproducible.
> 

So it seems sort of agnostic, doesn't it? :-)

> 
> Are you asking for fault tolerance?  If so, then you need a fault
> tolerant system like a Tandem. If you are asking for a way to build a
> cost effective solution using commercial, off-the-shelf (COTS)
> components, then that is far beyond what can be easily said in a forum
> posting.
>  -- richard

Yeah. High availability is a separate topic with more technical challenges
of its own.

Anyway, thank you very much.

Fred
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
