On 2019/1/15 7:51 PM, David Sterba wrote:
> On Tue, Jan 15, 2019 at 07:48:47PM +0800, Qu Wenruo wrote:
>>> following tree-dumps:
>>>
>>>   sudo btrfs inspect dump-tree -t root /dev/mapper/vg1-root > /tmp/btrfsdumproot
>>>   sudo btrfs inspect dump-tree -b 1350630375424 /dev/mapper/vg1-root > /tmp/btrfsdump1350630375424
>>>
>>> The root dump is at https://termbin.com/lz0l and the block dump at
>>> https://termbin.com/oev5 . The number 1350630375424 does not occur in
>>> the root dump. The root dump has 16715 lines, the block dump only 645.
>>
>> Super nice move, it shows the corruption and the cause.
>>
>>      item 66 key (1714119835648 METADATA_ITEM 0) itemoff 13325 itemsize 33
>>      item 67 key (10510212874240 METADATA_ITEM 0) itemoff 13283 itemsize 42
>>      item 68 key (1714119868416 METADATA_ITEM 0) itemoff 13250 itemsize 33
>>
>> Note that the key objectid of item 67 is far larger than those of
>> items 66 and 68.
>>
>> Furthermore, it does look like bit rot:
>> 0x18f19810000 (1714119835648)
>> 0x98f19814000 (10510212874240)
>> 0x18f19818000 (1714119868416)
>>
>> One bit got flipped: the leading hex digit went from 0x1 to 0x9.
>>
>> I don't know whether it was corrupted in memory or on the SSD, although
>> I tend to believe it was caused by a memory bit flip.
> 
> Single bit flips are almost always caused by RAM, not storage (which
> fails in larger blocks or does not return any data at all).

Yep, I don't think an on-disk bit flip could sneak in without
tripping both the disk's internal checksum and the tree block csum.

> 
>> But anyway, it can be fixed by patching the corrupted leaf manually.
> 
> That will fix one instance of the corrupted key; without an analysis of
> how far the wrong key has spread, it's still risky.

Judging from the content of the culprit leaf, it doesn't look too
problematic.

I would recommend fixing it first, then running a full btrfs check
--readonly, as with any repair routine.

Thanks,
Qu
