On 2014-11-18 02:29, Brendan Hide wrote:
Hey, guys

See further below extracted output from a daily scrub showing csum
errors on sdb, part of a raid1 btrfs. Looking back, it has been getting
errors like this for a few days now.

The disk is patently unreliable but smartctl's output implies there are
no issues. Is this somehow standard fare for S.M.A.R.T. output?

Here are (I think) the important bits of the smartctl output for
$(smartctl -a /dev/sdb) (the full results are attached):
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   253   006    Pre-fail  Always       -       0
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       1
  7 Seek_Error_Rate         0x000f   086   060   030    Pre-fail  Always       -       440801014
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always       -       0
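
As a rough illustration of why smartctl reports no problem here: an attribute only counts as failed when its normalized VALUE drops to or below THRESH, and none of the rows above are close. A minimal sketch in Python, assuming the ten-column row layout shown above (the names SmartAttr, parse_row and failing are made up for this example, not part of smartmontools):

```python
from typing import List, NamedTuple


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    value: int   # normalized VALUE column
    worst: int   # normalized WORST column
    thresh: int  # THRESH column
    raw: str     # RAW_VALUE column, kept as text


def parse_row(row: str) -> SmartAttr:
    # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    f = row.split()
    return SmartAttr(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]), f[9])


def failing(attrs: List[SmartAttr]) -> List[SmartAttr]:
    # An attribute is treated as failed when VALUE <= THRESH.
    return [a for a in attrs if a.value <= a.thresh]


# Three rows copied from the smartctl output in this thread.
rows = [
    "1 Raw_Read_Error_Rate 0x000f 100 253 006 Pre-fail Always - 0",
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 1",
    "7 Seek_Error_Rate 0x000f 086 060 030 Pre-fail Always - 440801014",
]
attrs = [parse_row(r) for r in rows]

if __name__ == "__main__":
    for a in attrs:
        print(f"{a.name}: value={a.value} thresh={a.thresh} raw={a.raw}")
    print("failing:", [a.name for a in failing(attrs)])
```

The large raw value on Seek_Error_Rate does not trip anything on its own, since only the normalized value (086 against a threshold of 030) is compared.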



-------- Original Message --------
Subject:     Cron <root@watricky> /usr/local/sbin/btrfs-scrub-all
Date:     Tue, 18 Nov 2014 04:19:12 +0200
From:     (Cron Daemon) <root@watricky>
To:     brendan@watricky



WARNING: errors detected during scrubbing, corrected.
[snip]
scrub device /dev/sdb2 (id 2) done
     scrub started at Tue Nov 18 03:22:58 2014 and finished after 2682 seconds
     total bytes scrubbed: 189.49GiB with 5420 errors
     error details: read=5 csum=5415
     corrected errors: 5420, uncorrectable errors: 0, unverified errors: 164
[snip]
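
To track whether these counts grow from one daily scrub to the next, the summary lines can be pulled into numbers. A minimal sketch in Python, assuming the scrub report keeps the wording shown above (parse_scrub is a hypothetical helper, not part of btrfs-progs):

```python
import re
from typing import Dict


def parse_scrub(text: str) -> Dict[str, int]:
    # Collapse whitespace first so a wrapped line such as
    # "unverified errors:\n164" is matched as one run of text.
    flat = " ".join(text.split())
    counts: Dict[str, int] = {}
    details = re.search(r"error details: (.*?) corrected errors:", flat)
    if details:
        for key, num in re.findall(r"(\w+)=(\d+)", details.group(1)):
            counts[key] = int(num)
    for key in ("corrected", "uncorrectable", "unverified"):
        m = re.search(rf"{key} errors: (\d+)", flat)
        if m:
            counts[key] = int(m.group(1))
    return counts


# Summary lines copied from the cron mail in this thread.
sample = """
     total bytes scrubbed: 189.49GiB with 5420 errors
     error details: read=5 csum=5415
     corrected errors: 5420, uncorrectable errors: 0, unverified errors:
164
"""
summary = parse_scrub(sample)

if __name__ == "__main__":
    print(summary)
```

For this report, read plus csum adds up to the corrected count, which matches the "errors detected during scrubbing, corrected" warning.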

In addition to the storage controller being a possibility, as mentioned in another reply, some parts of the drive aren't covered by SMART attributes on most disks, most notably the on-drive cache. There really isn't a way to disable the read cache on the drive, but you can disable write caching, which may improve things (and, if it's a cheap disk, may make BTRFS more reliable as well).

The other thing I would suggest trying is a different data cable to the drive itself. I've had issues with some SATA cables (the cheap red ones you get in the retail packaging for some hard disks in particular) having either bad connectors or bad strain reliefs, and failing after only a few hundred hours of use.
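
For the write-caching suggestion, one common way to do it on Linux is hdparm's -W option. A minimal sketch in Python that only builds and prints the commands, assuming hdparm is installed and the disk is /dev/sdb as in the original post (write_cache_cmd is a name invented for this example):

```python
import shlex
from typing import List, Optional


def write_cache_cmd(device: str, enable: Optional[bool] = None) -> List[str]:
    """Build an hdparm command line for the drive's write cache.

    enable=None queries the current setting, True turns the write
    cache on, False turns it off.
    """
    if enable is None:
        flag = "-W"
    else:
        flag = "-W1" if enable else "-W0"
    return ["hdparm", flag, device]


if __name__ == "__main__":
    # Printed rather than executed: running them needs root, and the
    # device name is an assumption taken from the original post.
    for cmd in (write_cache_cmd("/dev/sdb"), write_cache_cmd("/dev/sdb", False)):
        print(shlex.join(cmd))
```

The setting may not survive a power cycle on every drive, so it is worth querying it again after a reboot before relying on it.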
