Hi Stephan,

> On Jan 25, 2017, at 5:54 AM, Stephan Budach <[email protected]> wrote:
> 
> Hi guys,
> 
> I have been trying to import a zpool based on a 3-way mirror provided by 
> three omniOS boxes via iSCSI. This zpool had been working flawlessly until 
> some random reboot of the S11.1 host. Since then, S11.1 has been unable to 
> import this zpool successfully.
> 
> This zpool consists of three 108TB LUNs, based on raidz2 zvols… yeah, I 
> know, we shouldn't have done that in the first place, but performance was not 
> the primary goal here, as this one is a backup/archive pool.
> 
> When issuing a zpool import, it says this:
> 
> root@solaris11atest2:~# zpool import
>   pool: vsmPool10
>     id: 12653649504720395171
>  state: DEGRADED
> status: The pool was last accessed by another system.
> action: The pool can be imported despite missing or damaged devices.  The
>         fault tolerance of the pool may be compromised if imported.
>    see: http://support.oracle.com/msg/ZFS-8000-EY
> config:
> 
>         vsmPool10                                  DEGRADED
>           mirror-0                                 DEGRADED
>             c0t600144F07A3506580000569398F60001d0  DEGRADED  corrupted data
>             c0t600144F07A35066C00005693A0D90001d0  DEGRADED  corrupted data
>             c0t600144F07A35001A00005693A2810001d0  DEGRADED  corrupted data
> 
> device details:
> 
>         c0t600144F07A3506580000569398F60001d0    DEGRADED  scrub/resilver needed
>         status: ZFS detected errors on this device.
>                 The device is missing some data that is recoverable.
> 
>         c0t600144F07A35066C00005693A0D90001d0    DEGRADED  scrub/resilver needed
>         status: ZFS detected errors on this device.
>                 The device is missing some data that is recoverable.
> 
>         c0t600144F07A35001A00005693A2810001d0    DEGRADED  scrub/resilver needed
>         status: ZFS detected errors on this device.
>                 The device is missing some data that is recoverable.
> 
> However, when actually running zpool import -f vsmPool10, the system starts 
> to perform a lot of writes on the LUNs, and iostat reports an alarming 
> increase in h/w errors:
> 
> root@solaris11atest2:~# iostat -xeM 5
>                          extended device statistics         ---- errors ---
> device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b s/w h/w trn tot
> sd0       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0   0   0   0
> sd1       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0   0   0   0
> sd2       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0  71   0  71
> sd3       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0   0   0   0
> sd4       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0   0   0   0
> sd5       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0   0   0   0
>                          extended device statistics         ---- errors ---
> device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b s/w h/w trn tot
> sd0      14.2  147.3    0.7    0.4  0.2  0.1    2.0   6   9   0   0   0   0
> sd1      14.2    8.4    0.4    0.0  0.0  0.0    0.3   0   0   0   0   0   0
> sd2       0.0    4.2    0.0    0.0  0.0  0.0    0.0   0   0   0  92   0  92
> sd3     157.3   46.2    2.1    0.2  0.0  0.7    3.7   0  14   0  30   0  30
> sd4     123.9   29.4    1.6    0.1  0.0  1.7   10.9   0  36   0  40   0  40
> sd5     142.5   43.0    2.0    0.1  0.0  1.9   10.2   0  45   0  88   0  88
>                          extended device statistics         ---- errors ---
> device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b s/w h/w trn tot
> sd0       0.0  234.5    0.0    0.6  0.2  0.1    1.4   6  10   0   0   0   0
> sd1       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0   0   0   0
> sd2       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0  92   0  92
> sd3       3.6   64.0    0.0    0.5  0.0  4.3   63.2   0  63   0 235   0 235
> sd4       3.0   67.0    0.0    0.6  0.0  4.2   60.5   0  68   0 298   0 298
> sd5       4.2   59.6    0.0    0.4  0.0  5.2   81.0   0  72   0 406   0 406
>                          extended device statistics         ---- errors ---
> device    r/s    w/s   Mr/s   Mw/s wait actv  svc_t  %w  %b s/w h/w trn tot
> sd0       0.0  234.8    0.0    0.7  0.4  0.1    2.2  11  10   0   0   0   0
> sd1       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0   0   0   0
> sd2       0.0    0.0    0.0    0.0  0.0  0.0    0.0   0   0   0  92   0  92
> sd3       5.4   54.4    0.0    0.3  0.0  2.9   48.5   0  67   0 384   0 384
> sd4       6.0   53.4    0.0    0.3  0.0  4.6   77.7   0  87   0 519   0 519
> sd5       6.0   60.8    0.0    0.3  0.0  4.8   72.5   0  87   0 727   0 727

The h/w column is just a summary classification of other errors. The full 
error breakdown is available from "iostat -E" and will be important for 
tracking this down.
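
For example, something along these lines should show the full per-device 
error breakdown (Soft/Hard/Transport plus the Media Error, Device Not Ready, 
No Device, Recoverable, Illegal Request, and Predictive Failure Analysis 
counters), using the same cXtYdZ names as the pool config:

    # full error counters, one block per device, with descriptive names
    iostat -En

    # one summary line per device, handy for spotting the worst offenders
    iostat -En | grep "Errors:"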

A better, more detailed analysis can be gleaned from the "fmdump -e" ereports 
that should be associated with each h/w error. However, there are dozens of 
possible causes for these errors, so we don't have enough information here to 
fully understand what is going on.
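
As a rough starting point (the exact ereport classes you see will depend on 
what the drivers actually logged), something like this would tell us which 
ereport class is behind those h/w counts:

    # one-line summary of the error reports logged so far
    fmdump -e

    # full detail for each ereport -- look at the class (transport vs.
    # media errors, for instance) and the device-path/devid members
    fmdump -eV | less

"fmadm faulty" will also show whether FMA has already diagnosed a fault from 
them.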
 — richard

> 
> 
> I have tried pulling data from the LUNs using dd to /dev/null and didn't get 
> any h/w errors; those only started when actually trying to import the zpool. 
> As the h/w errors are constantly rising, I am wondering what could be causing 
> this and whether anything can be done about it.
> 
> Cheers,
> Stephan
> _______________________________________________
> OmniOS-discuss mailing list
> [email protected]
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

_______________________________________________
OmniOS-discuss mailing list
[email protected]
http://lists.omniti.com/mailman/listinfo/omnios-discuss
