Re: [zfs-discuss] Checksum errors with SSD.

2010-07-02 Thread Benjamin Grogg
Dear Cindy and Edward

Many thanks for your input. Indeed, there is something wrong with the SSD;
smartmontools also confirms a couple of errors.
So I opened a case and hopefully they will replace the SSD. What did I learn?
- Be careful with special offers
- Use rock-solid components even for your home server
- Use ZFS and scrub regularly
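
For what it's worth, the "scrub regularly" part is easy to automate with cron. A minimal sketch, using the pool names from this thread (the schedule and paths are assumptions, adjust to taste):

```shell
# Hypothetical root crontab entries: scrub both pools every Sunday night.
# Stagger the start times so the scrubs don't compete for I/O.
0 2 * * 0 /usr/sbin/zpool scrub rpool
0 3 * * 0 /usr/sbin/zpool scrub tank
```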

Best regards and many thanks for all your help and keep up the good work!
Benjamin
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Checksum errors with SSD.

2010-07-01 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Benjamin Grogg
> 
> When I scrub my pool I got a lot of checksum errors :
> 
> NAME        STATE     READ WRITE CKSUM
> rpool       DEGRADED     0     0     5
>   c8d0s0    DEGRADED     0     0    71  too many errors
> 
> Any hints?

What's the confusion?  Replace the drive.

If you think it's a false positive (the drive is not actually failing), then
run zpool clear (or zpool online, or whatever it takes until the pool looks
normal again) and then scrub again. If the errors come back, it almost
certainly means the drive is failing. It could also be the SATA cable that
connects to it, or the controller, but it's 99% certain to be the drive.
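
For reference, the clear-and-rescrub cycle described above would look something like this (a sketch using the pool and device names from this thread):

```shell
# Reset the error counters and return the vdev to a normal state
zpool clear rpool

# If the device was faulted, bring it back online explicitly
zpool online rpool c8d0s0

# Re-run the scrub, then watch whether the CKSUM counters climb again
zpool scrub rpool
zpool status -v rpool
```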



Re: [zfs-discuss] Checksum errors with SSD.

2010-07-01 Thread Cindy Swearingen

Hi Benjamin,

I'm not familiar with this disk, but you can see from the fmstat output
that the disk, system-event, and ZFS-related diagnostic modules have been
working overtime on something, and it's probably this disk.

You can get further details from fmdump -eV; you will probably
see lots of checksum errors on this disk.
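
A sketch of the inspection commands mentioned here (the grep filter is just a convenience for narrowing the fmdump output, not part of the diagnosis itself):

```shell
# Summarize fault-management module activity; busy zfs-diagnosis /
# zfs-retire counters point at ZFS-related problems
fmstat

# Dump the full error telemetry; expect ereport.fs.zfs.checksum
# entries naming the failing vdev
fmdump -eV | grep -B2 -A10 checksum

# Show any faults fmd has already diagnosed (e.g. ZFS-8000-GH)
fmadm faulty
```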

You might review some of the h/w diagnostic recommendations in this wiki:

http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide

I would recommend replacing the disk soon, or figuring out what other
issue might be causing problems for this disk.

Thanks,

Cindy
Benjamin Grogg wrote:

Dear Forum

When I scrub my pool I got a lot of checksum errors :

NAME        STATE     READ WRITE CKSUM
rpool       DEGRADED     0     0     5
  c8d0s0    DEGRADED     0     0    71  too many errors

[...]



[zfs-discuss] Checksum errors with SSD.

2010-07-01 Thread Benjamin Grogg
Dear Forum

I use a KINGSTON SNV125-S2/30GB SSD on an ASUS M3A78-CM motherboard (AMD SB700 chipset).
SATA type (in BIOS) is SATA.
OS: SunOS homesvr 5.11 snv_134 i86pc i386 i86pc

When I scrub my pool I get a lot of checksum errors:

NAME        STATE     READ WRITE CKSUM
rpool       DEGRADED     0     0     5
  c8d0s0    DEGRADED     0     0    71  too many errors

Running "zpool clear rpool" works, but after the next scrub I have the same situation again.
fmstat looks like this:

module              ev_recv ev_acpt wait  svc_t  %w  %b  open solve  memsz  bufsz
cpumem-retire             0       0  0.0    0.0   0   0     0     0      0      0
disk-transport            0       0  0.0 1541.1   0   0     0     0    32b      0
eft                       1       0  0.0    4.7   0   0     0     0   1.2M      0
ext-event-transport       3       0  0.0    2.1   0   0     0     0      0      0
fabric-xlate              0       0  0.0    0.0   0   0     0     0      0      0
fmd-self-diagnosis        6       0  0.0    0.0   0   0     0     0      0      0
io-retire                 0       0  0.0    0.0   0   0     0     0      0      0
sensor-transport          0       0  0.0   37.3   0   0     0     0    32b      0
snmp-trapgen              3       0  0.0    1.1   0   0     0     0      0      0
sysevent-transport        0       0  0.0 2836.3   0   0     0     0      0      0
syslog-msgs               3       0  0.0    2.7   0   0     0     0      0      0
zfs-diagnosis            91      77  0.0   28.9   0   0     2     1   336b   280b
zfs-retire               10       0  0.0  387.9   0   0     0     0   620b      0

fmadm looks like this:

---------------  --------------------------------------  -------------  --------
TIME             EVENT-ID                                MSG-ID         SEVERITY
---------------  --------------------------------------  -------------  --------
Jun 30 16:37:28  806072e5-7cd6-efc1-c89d-d40bce4adf72    ZFS-8000-GH    Major

Host: homesvr
Platform: System-Product-Name   Chassis_id  : System-Serial-Number
Product_sn  : 

Fault class : fault.fs.zfs.vdev.checksum
Affects : zfs://pool=rpool/vdev=f7dad7554a72b3bc
  faulted but still in service
Problem in  : zfs://pool=rpool/vdev=f7dad7554a72b3bc
  faulted but still in service

In /var/adm/messages I don't see any abnormal issues.
I have also tried the SSD on another SATA port, but without success.

My other HDDs run smoothly:

NAME        STATE     READ WRITE CKSUM
tank        ONLINE       0     0     0
  mirror-0  ONLINE       0     0     0
    c4d1    ONLINE       0     0     0
    c5d0    ONLINE       0     0     0

iostat gives me the following:

c4d1 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: WDC WD10EVDS-63 Revision: Serial No: WD-WCAV592 Size: 1000.20GB <1000202305536 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
c5d0 Soft Errors: 981 Hard Errors: 0 Transport Errors: 981
Model: Hitachi HDS7210 Revision: Serial No: JP2921HQ0 Size: 1000.20GB <1000202305536 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
c8d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: KINGSTON SSDNOW Revision: Serial No: 30PM10I Size: 30.02GB <30016659456 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

Any hints?
Best regards and many thanks for your help!

Benjamin