On Sat, Feb 2, 2019 at 10:40 PM Alan Hardman <al...@fastmail.com> wrote:
> # journalctl | grep -A 15 exception > ... > Jan 23 01:06:37 localhost kernel: ata3.00: status: { DRDY } > Jan 23 01:06:37 localhost kernel: ata3.00: failed command: WRITE FPDMA QUEUED > Jan 23 01:06:37 localhost kernel: ata3.00: cmd > 61/b0:98:ea:7a:48/00:00:0a:00:00/40 tag 19 ncq dma 90112 out > res > 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) > -- > Jan 31 19:24:32 localhost kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr > 0x0 action 0x6 frozen > Jan 31 19:24:32 localhost kernel: ata5.00: failed command: READ DMA EXT > Jan 31 19:24:32 localhost kernel: ata5.00: cmd > 25/00:08:a8:2a:81/00:00:a3:03:00/e0 tag 0 dma 4096 in > res > 40/00:01:00:00:00/00:00:00:00:00/10 Emask 0x4 (timeout) > Jan 31 19:24:32 localhost kernel: ata5.00: status: { DRDY } > Jan 31 19:24:32 localhost kernel: ata5: link is slow to respond, please be > patient (ready=0) > Jan 31 19:24:32 localhost kernel: ata5: device not ready (errno=-16), forcing > hardreset > Jan 31 19:24:32 localhost kernel: ata5: soft resetting link This kind of error on read is common when there is a marginally bad sector, and the drive is doing deep recovery, and the time for recovery is longer than the SCSI command timer. The reset clears the command queue, meaning it's no longer possible to find out what sector is marginal, and Btrfs can't fix it up. When left for a long time, it allows bad sectors to accumulate, including during scrubbing. I don't actually know what Btrfs does in this case, if the scrub is aborted, or if it continues and shows the error as uncorrectable. This isn't the only possible explanation for this kind of error. But the bottom line is that it's generic, indicates there is a problem, and you need to find out what the real cause is. It's asking for trouble to just leave this kind of error floating in the wind, even if it seems like everything is otherwise working. > Jan 31 19:24:32 localhost kernel: ata5.00: configured for UDMA/33 > Jan 31 19:24:32 localhost kernel: ata5.01: configured for UDMA/33 > Jan 31 19:24:32 localhost kernel: sd 4:0:0:0: [sde] tag#0 FAILED Result: > hostbyte=DID_OK driverbyte=DRIVER_SENSE > Jan 31 19:24:32 localhost kernel: sd 4:0:0:0: [sde] tag#0 Sense Key : Illegal > Request [current] > Jan 31 19:24:32 localhost kernel: sd 4:0:0:0: [sde] tag#0 Add. Sense: > Unaligned write command > Jan 31 19:24:32 localhost kernel: sd 4:0:0:0: [sde] tag#0 CDB: Read(16) 88 00 > 00 00 00 03 a3 81 2a a8 00 00 00 08 00 00 > I don't know what the portion of this error sequence means, "illegal request" and "unaligned write command". On 512e Advanced Format hard drives, which have 512 byte logical 4096 byte physical sectors, a 512 byte write must actually be converted to 4096 byte read by the drive firmware in order to do a read write modify. It can't do an overwrite of just those 512 bytes. If there's a problem reading that sector, it comes back as a read error. But... I'm not sure any of that is what this is. To my knowledge Btrfs never does 512 byte writes, the minimum read or write is 4096 bytes. Moving on... >From your attached log, there are lots of failed writes. That really must be sorted out because for all raid, write failures are fatal. With md based raid, it'll eject the drive as faulty on a single write failure. Whereas Btrfs keeps trying, as it has no concept of faulty drives still. So you've got recent missing writes to whatever drive has all these write errors. The 'grep -A 15' limited the output so I can't tell how the error ended up being handled either by libata or Btrfs. You really need to find out what device has write problems as the next priority, as this is decently likely to prevent any 'btrfs check --repair' from succeeding. Next the SCSI command timer (this is a kernel timeout per block device) for all devices is the default of 30 seconds. Most drives have SCT ERC of 70 deciseconds which is *good*. But two drives do not, and instead I see: SCT Error Recovery Control command not supported That's not good because it's possible marginally bad sectors will cause the drive firmwar Next, one of those drives has some UDMA errors which suggests an actual link problem between the drive's controller and the logic board's controller - the most logical suspect is just reseating the cable on both ends. But maybe the cable needs replacing. So you'll want to keep an eye on whether these errors continue or not. They'll trigger a libata error message and it'll either retry or reset the link, and therefore might be what caused one of your link reset errors. 199 UDMA_CRC_Error_Count -OSRCK 200 200 000 - 19 OK so you have a corrupt leaf probably from bad memory, which probably corrupted both copies of the leaf, which is why we don't see any fixup messages. I don't actually know if 'btrfs check --repair' can fix this kind of bad memory induced corruption. But in any case, we've got to get the other hardware problems figured out before a repair is assured to stick. > I have been able to successfully recover files via "btrfs restore ...", and > there doesn't seem to be anything essential missing from its full output with > -D, so if that's necessary to use to offload the entire filesystem, it at > least seems possible if it can't be recovered directly. If the data is important, and if this is the only copy, I always argue in favor of urgently making a backup. Set aside all other troubleshooting. -- Chris Murphy