On 2019/1/15 7:51 PM, David Sterba wrote:
> On Tue, Jan 15, 2019 at 07:48:47PM +0800, Qu Wenruo wrote:
>>> following tree-dumps:
>>>
>>> sudo btrfs inspect dump-tree -t root /dev/mapper/vg1-root > /tmp/btrfsdumproot
>>> sudo btrfs inspect dump-tree -b 1350630375424 /dev/mapper/vg1-root > /tmp/btrfsdump1350630375424
>>>
>>> The root dump is at https://termbin.com/lz0l and the block dump at
>>> https://termbin.com/oev5 . The number 1350630375424 does not occur in
>>> the root dump. The root dump has 16715 lines, the block dump only 645.
>>
>> Super nice move, it shows the corruption and the cause.
>>
>> item 66 key (1714119835648 METADATA_ITEM 0) itemoff 13325 itemsize 33
>> item 67 key (10510212874240 METADATA_ITEM 0) itemoff 13283 itemsize 42
>> item 68 key (1714119868416 METADATA_ITEM 0) itemoff 13250 itemsize 33
>>
>> See the key objectid of key 67 is way larger than item 66/68.
>>
>> And furthermore, it indeed looks like a bit rot:
>> 0x18f19810000 (1714119835648)
>> 0x98f19814000 (10510212874240)
>> 0x18f19818000 (1714119868416)
>>
>> See one bit got flipped.
>>
>> I don't know it's corrupted in memory or on the SSD, although I tend to
>> believe it's caused by memory bit flip.
>
> Single bit flips are almost always caused by RAM, not storage (that
> fails in larger blocks or does not even return any data)

Yep, I don't really think a bit flip on storage could sneak in without
tripping both the disk's internal csum and the tree block csum.

>
>> But anyway, it can be fixed by patching the corrupted leaf manually.
>
> That will fix one instance of the corrupted key, without an analysis how
> far the wrong key got spread it's still risky.

Judging from the content of the culprit leaf, it doesn't look too
problematic.

I would recommend fixing it first, then doing a full btrfs check
--readonly, as with any repair routine.

Thanks,
Qu
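
For anyone who wants to double-check the single-bit-flip reading of the three keys quoted above, here is a minimal sketch. It assumes only that keys in a leaf are sorted, so the intended objectid of item 67 must fall strictly between those of items 66 and 68. The helper name `single_bit_candidates` is made up for illustration and is not part of btrfs-progs.

```python
def single_bit_candidates(prev_key, bad_key, next_key, width=64):
    """Return (bit, repaired_key) pairs where flipping exactly one bit of
    bad_key yields a value that sorts strictly between its neighbours."""
    candidates = []
    for bit in range(width):
        repaired = bad_key ^ (1 << bit)  # flip a single bit
        if prev_key < repaired < next_key:
            candidates.append((bit, repaired))
    return candidates


# Key objectids of items 66, 67 and 68 from the block dump.
prev_key = 0x18f19810000
bad_key = 0x98f19814000
next_key = 0x18f19818000

for bit, repaired in single_bit_candidates(prev_key, bad_key, next_key):
    print(f"bit {bit}: {repaired:#x} ({repaired})")
# prints: bit 43: 0x18f19814000 (1714119852032)
```

Only one flip restores the sort order, which fits a single-bit error. It says nothing about whether other blocks carry the same wrong key.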