On 2015-12-01 07:57, Gareth Pye wrote:
Poking around I just noticed that btrfs de stats /data points out that
3 of my drives have some read_io_errors. I'm guessing that is a bad
thing. I assume this would indicate bad hardware and would be a likely
cause of system crashes.
In general, given that info, I would suggest that you do the following:
1. Run btrfs device stats -z to reset the counters (they're running
   counts stored on disk, not counts of recent errors or errors since
   last boot, so the numbers are probably over the lifetime of the
   filesystem right now).
2. Run a scrub on the filesystem (if you add -Bd, you get stats
   per-device when it's done, although it runs in the foreground). If
   the scrub reports no errors, it's less likely that the issue is
   hardware than software (or just the system having crashed).
3. Regardless of the scrub results, use smartctl (usually found in a
   package called smartmontools or something similar) to check what the
   disk firmware thinks about how healthy the disk hardware is.
   Interpreting anything beyond the SMART attributes and the SMART
   health status is somewhat difficult without a lot of experience and
   some significant low-level knowledge of the hardware and software,
   but if the disk says it's healthy (check smartctl -H, and possibly
   smartctl -A), then it's _probably_ OK.
4. Check your kernel logs for messages about ATA link resets. If you
   see a number of these, check your cables. If the cables are fine
   (securely connected, don't appear damaged), then this may be an
   early indication of failing hardware (although there are other
   non-failure hardware issues this can be indicative of).
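
For reference, the four steps could be strung together roughly like this. Treat it as a sketch, not something tested on your setup: MNT, DISKS and both function names are placeholders I made up, and the grep pattern assumes the kernel words link resets as "resetting link", which is what I usually see but may differ on your kernel.

```shell
#!/bin/sh
# Sketch only: MNT and DISKS are hypothetical placeholders, adjust them.
MNT=/data
DISKS="/dev/sda /dev/sdb"

# Filter `btrfs device stats` output (on stdin) down to non-zero counters.
nonzero_counters() {
    awk '$2 != 0 { print $1, $2 }'
}

# Steps 1-4 in order. Needs root; nothing runs until you call the function.
check_btrfs_health() {
    btrfs device stats -z "$MNT"        # 1: print the counters, then reset them
    btrfs scrub start -Bd "$MNT"        # 2: foreground scrub, per-device stats
    btrfs device stats "$MNT" | nonzero_counters  # errors logged since the reset
    for disk in $DISKS; do              # 3: ask the disk firmware
        smartctl -H "$disk"             #    overall health status
        smartctl -A "$disk"             #    vendor attributes and thresholds
    done
    dmesg | grep -i 'resetting link'    # 4: ATA link resets (assumed wording)
}
```

Source the file and run check_btrfs_health as root. The scrub can take hours on a large array, so expect the function to block at step 2.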

In general, read errors are not a huge issue as long as you scrub the filesystem regularly (unless you get a lot of them in a short period of time, in which case you should be worried). When you start getting write errors or link resets (as mentioned in step 4 above), or when the SMART pre-failure attributes hit their thresholds, that is when you should start actively looking for a replacement disk (and verifying your backups).
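
If you want that rule of thumb in a form you can pipe stats output through, something like the following might do. The WORRY/WATCH/OTHER labels and the function name are my own invention, and lumping the remaining counters (flush, corruption, generation) under OTHER is an assumption on my part, not a judgment about how serious they are.

```shell
#!/bin/sh
# Reads `btrfs device stats` output on stdin and labels each non-zero
# counter. The labels are hypothetical, not anything btrfs itself prints.
classify_counters() {
    awk '
        $2 == 0                 { next }                       # clean counter
        $1 ~ /\.write_io_errs$/ { print "WORRY", $1, $2; next } # write errors
        $1 ~ /\.read_io_errs$/  { print "WATCH", $1, $2; next } # read errors
                                { print "OTHER", $1, $2 }       # everything else
    '
}
```

Typical use would be: btrfs device stats /data | classify_counters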
