Hello,

I'm finishing up my data migration to Btrfs, and I've run into an
error that I'm trying to explore in more detail. I'm using Fedora 20
with Btrfs v0.20-rc1.

My array is a 5-disk (4x 1TB and 1x 2TB) RAID 6 (-d raid6 -m raid6). I
completed my rsync to this array, and I figured it would be prudent to
run a scrub before I consider this array the canonical version of my
data. The scrub is still running, but I currently have the following
status:

~$ btrfs scrub status t
scrub status for 7b7afc82-f77c-44c0-b315-669ebd82f0c5
scrub started at Mon Feb 24 20:10:54 2014, running for 86080 seconds
total bytes scrubbed: 2.71TiB with 1 errors
error details: read=1
corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
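
For reference, the filesystem was created and the scrub started
roughly as follows; the device names below are only illustrative
(the journal output further down mentions sdd and sdf1), and 't' is
the mount point used in the status output above:

~$ mkfs.btrfs -d raid6 -m raid6 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
~$ mount /dev/sdb1 t
~$ btrfs scrub start t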

The scrub error is accompanied by the following messages in the journal:

Feb 25 15:16:24 localhost kernel: ata4.00: exception Emask 0x0 SAct 0x3f SErr 0x0 action 0x0
Feb 25 15:16:24 localhost kernel: ata4.00: irq_stat 0x40000008
Feb 25 15:16:24 localhost kernel: ata4.00: failed command: READ FPDMA QUEUED
Feb 25 15:16:24 localhost kernel: ata4.00: cmd 60/08:08:b8:24:af/00:00:58:00:00/40 tag 1 ncq 4096 in
                                           res 41/40:00:be:24:af/00:00:58:00:00/40 Emask 0x409 (media error) <F>
Feb 25 15:16:24 localhost kernel: ata4.00: status: { DRDY ERR }
Feb 25 15:16:24 localhost kernel: ata4.00: error: { UNC }
Feb 25 15:16:24 localhost kernel: ata4.00: configured for UDMA/133
Feb 25 15:16:24 localhost kernel: sd 3:0:0:0: [sdd] Unhandled sense code
Feb 25 15:16:24 localhost kernel: sd 3:0:0:0: [sdd]
Feb 25 15:16:24 localhost kernel: Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Feb 25 15:16:24 localhost kernel: sd 3:0:0:0: [sdd]
Feb 25 15:16:24 localhost kernel: Sense Key : Medium Error [current] [descriptor]
Feb 25 15:16:24 localhost kernel: Descriptor sense data with sense descriptors (in hex):
Feb 25 15:16:24 localhost kernel:         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Feb 25 15:16:24 localhost kernel:         58 af 24 be
Feb 25 15:16:24 localhost kernel: sd 3:0:0:0: [sdd]
Feb 25 15:16:24 localhost kernel: Add. Sense: Unrecovered read error - auto reallocate failed
Feb 25 15:16:24 localhost kernel: sd 3:0:0:0: [sdd] CDB:
Feb 25 15:16:24 localhost kernel: Read(10): 28 00 58 af 24 b8 00 00 08 00
Feb 25 15:16:24 localhost kernel: end_request: I/O error, dev sdd, sector 1487873214
Feb 25 15:16:24 localhost kernel: ata4: EH complete
Feb 25 15:16:24 localhost kernel: btrfs: i/o error at logical 2285387870208 on dev /dev/sdf1, sector 1488392888, root 5, inode 357715, offset 48787456, length 4096, links 1 (path: PATH/TO/REDACTED_FILE)
Feb 25 15:16:24 localhost kernel: btrfs: bdev /dev/sdf1 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
Feb 25 15:16:24 localhost kernel: btrfs: unable to fixup (regular) error at logical 2285387870208 on dev /dev/sdf1

I have a few questions:

* How is "total bytes scrubbed" determined? This array only has 2.2TB
of space used, so I'm confused about how many total bytes need to be
scrubbed before it finishes. (My rough math is below, after the
questions.)

* What is the best way to recover from this error? If I delete
PATH/TO/REDACTED_FILE and recopy it, will everything be okay? (I found
a thread on the Arch Linux forums,
https://bbs.archlinux.org/viewtopic.php?id=170795, that mentions this
as a solution, but I can't tell if it's the proper method.) A sketch
of what I mean is below, after the questions.

* Should I run another scrub? (I'd like to avoid another scrub if
possible because the scrub has been running for 24 hours already.)

* When a scrub is not running, is there any `btrfs` command that will
show me corrected and uncorrectable errors that occur during normal
operation? I'm thinking of something similar to `mdadm -D`. (I mention
the closest thing I've found so far below, after the questions.)

* It seems like this type of error shouldn't happen on RAID6, since
between the data, p parity, and q parity there should be enough
information to reconstruct the bad block. Is this just a limitation of
the current RAID 5/6 implementation?
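
On the first question, my back-of-the-envelope guess (and it is only
a guess) is that scrub counts raw bytes touched on every device,
parity included. With 5 devices in RAID6 a full stripe is 3 data
strips plus p and q, so 2.2TB of data would work out to very roughly
2.2 * 5/3 ≈ 3.7TB of raw bytes to scrub, which would at least explain
why the counter has already passed the amount of data stored. I'd
appreciate confirmation or correction on that.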
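
On recovery, what I have in mind from the Arch thread is roughly the
following (path redacted as above, with 't' as the mount point and
the source copy still available on the old array):

~$ rm t/PATH/TO/REDACTED_FILE
~$ rsync -a /old-array/PATH/TO/REDACTED_FILE t/PATH/TO/
~$ btrfs scrub start t    # or a cheaper way to re-verify just this file?

i.e. delete the file whose extent can't be read, copy it back from
the source, and re-verify. I'd like to know whether that is actually
sufficient, or whether the failing sector on sdd needs to be dealt
with at the drive level first.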
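
On the error-counter question, the closest thing I've found so far is
`btrfs device stats`, which (assuming my btrfs-progs build includes
it) should print per-device read/write/flush/corruption/generation
error counters, presumably the same counters as the "bdev /dev/sdf1
errs: ..." line in the journal output. I can't tell whether that's
the intended equivalent of `mdadm -D` or whether those counters
survive a remount, so pointers are welcome.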

Thanks,
Justin