On Wed, Apr 19, 2017 at 11:44 AM, Henk Slager <eye...@gmail.com> wrote:
> I also have a WD40EZRX and the fs on it is also almost exclusively a
> btrfs receive target, and it has now for the second time csum (just 5)
> errors. Extended selftest at 16K hours shows no problem and I am not
> fully sure if this is a magnetic media error case or something else.

I have now located the 20K (sequential) of bad csums in a 4G file, plus
the physical chunk address. I then read that 1G chunk to a file and
wrote it back to the same disk location. No I/O errors in dmesg, so my
assumption is that the 20K bad spot has been remapped to good spare
sectors. Or it was a btrfs or luks fault, or just a spurious random
write due to some SW/HW glitch.

As a way of pinning down the bad area, I did a cp --reflink of the 4G
file to the root of the fs, and a read-writeback of the 20K spot in the
4G file on the send-source fs. So after another differential receive, I
remove all but the latest snapshot. The 5 csum errors will then sit
there, pinned in place, as long as I don't balance.

Then, just before I do a btrfs replace (if I decide to), I delete the
4G file and make sure the cleaner has finished, so that the replace
will not fail on the 5 bad csums.

The fs on the WD40EZRX is just another clone/backup, but with a fairly
complex subvolume tree. The above actions + replace are more fun, and
faster, than recreating the tree with rsync etc. I have done similar
things in the past, when csum errors were clearly due to btrfs bugs but
the HDDs themselves were fine.
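For anyone who wants to replay this: roughly the commands involved,
from memory, with mountpoints and addresses replaced by made-up example
values. Finding the bad blocks was a scrub plus the inspect-internal
tooling:

  # scrub flags the bad blocks; the kernel log then has lines like
  # "checksum error at logical ... physical ..." for each csum failure
  btrfs scrub start -B /mnt/backup
  dmesg | grep -i 'checksum error'

  # if the log does not already print the path, map a reported logical
  # address back to the file(s) referencing it
  btrfs inspect-internal logical-resolve 123456789012 /mnt/backup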
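The chunk read-back/write-back was plain dd against the luks/dm device,
using the physical address from the scrub report; I'd do this with the
fs unmounted or at least idle. Note that skip/seek below count 1MiB
blocks, so the byte offset has to be divided by 1MiB (123456 is a
made-up value):

  # read the 1G data chunk that contains the bad 20K to a scratch file
  dd if=/dev/mapper/backup-luks of=/tmp/chunk.img bs=1M count=1024 skip=123456

  # write the same data back to the same location; this should make the
  # drive remap any weak sectors, and conv=fsync flushes before dd exits
  dd if=/tmp/chunk.img of=/dev/mapper/backup-luks bs=1M count=1024 seek=123456 conv=fsync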
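Pinning the bad area and refreshing the 20K on the send side, sketched
with placeholder paths and a placeholder 4K-block offset (the 20K being
5 x 4K blocks):

  # on the receive target: a reflink copy pins the extents, including the
  # 5 bad blocks, so the space cannot be reused while snapshots come and go
  cp --reflink=always /mnt/backup/data/big-4G.img /mnt/backup/big-4G.pinned

  # on the send source: read the 20K spot and write it back in place;
  # conv=notrunc leaves the rest of the file alone, and the rewrite gives
  # the range fresh extents that the next differential send carries over
  dd if=/srv/data/big-4G.img of=/tmp/spot.bin bs=4K skip=987654 count=5
  dd if=/tmp/spot.bin of=/srv/data/big-4G.img bs=4K seek=987654 count=5 conv=notrunc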
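And the final step, if I decide to go for the replace (device names are
again examples):

  # drop the reflink pin, the last remaining reference to the 5 bad blocks
  rm /mnt/backup/big-4G.pinned
  sync

  # wait until the cleaner has finished processing the deleted
  # snapshots/subvolumes, so replace cannot trip over the 5 bad csums
  btrfs subvolume sync /mnt/backup

  # then rebuild everything onto the new disk
  btrfs replace start /dev/mapper/backup-luks /dev/mapper/newdisk-luks /mnt/backup
  btrfs replace status /mnt/backup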