Re: read time tree block corruption detected

2021-04-20 Thread Gervais, Francois
> On 2021/4/19 10:56 PM, Gervais, Francois wrote:
>>> My bad, wrong number.
>>>
>>> The correct command is:
>>> # btrfs ins dump-tree -b 790151168 /dev/loop0p3
>>
>>
>> root@debug:~# btrfs ins dump-tree -b 790151168 /dev/loop0p3
>> btrfs-progs v5.7
> [...]
>>    item 4 key (5007 INODE_ITEM 0) itemoff 15760 itemsize 160
>>    generation 294 transid 219603 size 0 nbytes 18446462598731726987
> 
> The nbytes looks very strange.
> 
> It's 0xfffeffffffef008b, which definitely looks awful for an empty inode.
> 
>>    block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0
>>    sequence 476091 flags 0x0(none)
>>    atime 1610373772.750632843 (2021-01-11 14:02:52)
>>    ctime 1617477826.205928110 (2021-04-03 19:23:46)
>>    mtime 1617477826.205928110 (2021-04-03 19:23:46)
>>    otime 0.0 (1970-01-01 00:00:00)
>>    item 5 key (5007 INODE_REF 4727) itemoff 15732 itemsize 28
>>    index 0 namelen 0 name:
>>    index 0 namelen 0 name:
>>    index 0 namelen 294 name:
> 
> Definitely corrupted. I'm afraid tree-checker is correct.
> 
> The log tree is corrupted.
> And the check that detects such a corrupted inode ref was only introduced
> in the v5.5 kernel, so no wonder the v5.4 kernel didn't catch it at runtime.

Would detecting it at runtime with a newer kernel have helped in any way with
the corruption?
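
For context, here is a minimal sketch of the bounds check the newer
tree-checker applies to INODE_REF items, simplified from the logic in
fs/btrfs/tree-checker.c (not the exact kernel code). It is detection only: a
leaf that fails the check is rejected at read time, but nothing is repaired
or prevented.

/* Simplified sketch: every inode ref header plus the name that follows
 * it must fit inside the item, otherwise the leaf is corrupted. */
#include <stdint.h>
#include <stdio.h>

struct inode_ref_hdr {          /* on-disk layout: u64 index, u16 name_len */
    uint64_t index;
    uint16_t name_len;
} __attribute__((packed));

/* ptr and end are byte offsets of the item's data inside the leaf */
static int check_inode_ref(const uint8_t *leaf, unsigned long ptr,
                           unsigned long end)
{
    while (ptr < end) {
        const struct inode_ref_hdr *ref;

        /* the fixed 10-byte header must fit inside the item... */
        if (ptr + sizeof(*ref) > end) {
            fprintf(stderr, "inode ref overflow, ptr %lu end %lu\n",
                    ptr, end);
            return -1;
        }
        ref = (const struct inode_ref_hdr *)(leaf + ptr);

        /* ...and so must the name bytes that follow it; a namelen of
         * 294 with only 8 bytes left is what fired in this report */
        if (ptr + sizeof(*ref) + ref->name_len > end) {
            fprintf(stderr,
                    "inode ref overflow, ptr %lu end %lu namelen %u\n",
                    ptr, end, ref->name_len);
            return -1;
        }
        ptr += sizeof(*ref) + ref->name_len;
    }
    return 0;
}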

> 
> I don't have any idea why this could happen, as it doesn't look like an
> obvious memory bitflip.

The test engineer says that the last thing he did was remove power from the
device.

Could power loss be the cause of this issue?

> 
> Maybe Filipe could have some clue on this?
> 
> Thanks,
> Qu
> 
>>    item 6 key (5041 INODE_ITEM 0) itemoff 15572 itemsize 160
>>    generation 295 transid 219603 size 4096 nbytes 4096
>>    block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0
>>    sequence 321954 flags 0x0(none)
>>    atime 1610373832.763235044 (2021-01-11 14:03:52)
>>    ctime 1617477815.541863825 (2021-04-03 19:23:35)
>>    mtime 1617477815.541863825 (2021-04-03 19:23:35)
>>    otime 0.0 (1970-01-01 00:00:00)
>>    item 7 key (5041 INODE_REF 4727) itemoff 15544 itemsize 28
>>    index 12 namelen 18 name: health_metrics.txt
>>    item 8 key (5041 EXTENT_DATA 0) itemoff 15491 itemsize 53
>>    generation 219603 type 1 (regular)
>>    extent data disk byte 12746752 nr 4096
>>    extent data offset 0 nr 4096 ram 4096
>>    extent compression 0 (none)
>>    item 9 key (EXTENT_CSUM EXTENT_CSUM 12746752) itemoff 15487 itemsize 4
>>    range start 12746752 end 12750848 length 4096
>>

Re: read time tree block corruption detected

2021-04-19 Thread Gervais, Francois
> My bad, wrong number.
>
> The correct command is:
> # btrfs ins dump-tree -b 790151168 /dev/loop0p3


root@debug:~# btrfs ins dump-tree -b 790151168 /dev/loop0p3
btrfs-progs v5.7 
leaf 790151168 items 10 free space 15237 generation 219603 owner TREE_LOG
leaf 790151168 flags 0x1(WRITTEN) backref revision 1
fs uuid 29d53427-f943-43ad-a99e-ac695d225d0b
chunk uuid 04c4bf25-55ac-487e-97a3-fbdc84961b4a
item 0 key (4614 INODE_ITEM 0) itemoff 16123 itemsize 160
generation 282 transid 219603 size 0 nbytes 0
block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0
sequence 1345948 flags 0x0(none)
atime 1610373764.218465480 (2021-01-11 14:02:44)
ctime 1617477830.389953334 (2021-04-03 19:23:50)
mtime 1617477830.389953334 (2021-04-03 19:23:50)
otime 606208.1 (1970-01-08 00:23:28)
item 1 key (4614 INODE_REF 1020) itemoff 16101 itemsize 22
index 1217 namelen 12 name: brokerStatus
item 2 key (4996 INODE_ITEM 0) itemoff 15941 itemsize 160
generation 290 transid 219603 size 0 nbytes 0
block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0
sequence 4801736 flags 0x0(none)
atime 1617304887.496533028 (2021-04-01 19:21:27)
ctime 1617477830.681955095 (2021-04-03 19:23:50)
mtime 1617477830.681955095 (2021-04-03 19:23:50)
otime 0.0 (1970-01-01 00:00:00)
item 3 key (4996 INODE_REF 4715) itemoff 15920 itemsize 21
index 9 namelen 11 name: scodes.conf
item 4 key (5007 INODE_ITEM 0) itemoff 15760 itemsize 160
generation 294 transid 219603 size 0 nbytes 18446462598731726987
block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0
sequence 476091 flags 0x0(none)
atime 1610373772.750632843 (2021-01-11 14:02:52)
ctime 1617477826.205928110 (2021-04-03 19:23:46)
mtime 1617477826.205928110 (2021-04-03 19:23:46)
otime 0.0 (1970-01-01 00:00:00)
item 5 key (5007 INODE_REF 4727) itemoff 15732 itemsize 28
index 0 namelen 0 name: 
index 0 namelen 0 name: 
index 0 namelen 294 name: 
item 6 key (5041 INODE_ITEM 0) itemoff 15572 itemsize 160
generation 295 transid 219603 size 4096 nbytes 4096
block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0
sequence 321954 flags 0x0(none)
atime 1610373832.763235044 (2021-01-11 14:03:52)
ctime 1617477815.541863825 (2021-04-03 19:23:35)
mtime 1617477815.541863825 (2021-04-03 19:23:35)
otime 0.0 (1970-01-01 00:00:00)
item 7 key (5041 INODE_REF 4727) itemoff 15544 itemsize 28
index 12 namelen 18 name: health_metrics.txt
item 8 key (5041 EXTENT_DATA 0) itemoff 15491 itemsize 53
generation 219603 type 1 (regular)
extent data disk byte 12746752 nr 4096
extent data offset 0 nr 4096 ram 4096
extent compression 0 (none)
item 9 key (EXTENT_CSUM EXTENT_CSUM 12746752) itemoff 15487 itemsize 4
range start 12746752 end 12750848 length 4096


Re: read time tree block corruption detected

2021-04-19 Thread Gervais, Francois
> Please provide the following dump:
>   # btrfs ins dump-tree -b 18446744073709551610 /dev/loop0p3
>
> I'm wondering why write-time tree-check didn't catch it.
>
> Thanks,
> Qu

I get:

root@debug:~# btrfs ins dump-tree -b 18446744073709551610 /dev/loop0p3
btrfs-progs v5.7 
ERROR: tree block bytenr 18446744073709551610 is not aligned to sectorsize 4096

We have an unusual partition table due to a hardware (CPU) requirement.
Could this be the source of the error?

Disk /dev/loop0: 40763392 sectors, 19.4 GiB
Sector size (logical/physical): 512/512 bytes
Disk identifier (GUID): A18E4543-634A-4E8C-B55D-DA1E217C4D98
Partition table holds up to 24 entries
Main partition table begins at sector 2 and ends at sector 7
First usable sector is 8, last usable sector is 40763384
Partitions will be aligned on 8-sector boundaries
Total free space is 0 sectors (0 bytes)

Number  Start (sector)  End (sector)  Size        Code  Name
     1               8         32775  16.0 MiB    8300
     2           32776        237575  100.0 MiB   8300
     3          237576      40763384  19.3 GiB    8300
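
For what it's worth, the alignment error itself is not about the partition
layout: interpreted as an unsigned 64-bit value, 18446744073709551610 is -6,
which matches BTRFS_TREE_LOG_OBJECTID in the btrfs headers. In other words it
is the log tree's root objectid (the root=18446744073709551610 seen in the
mount errors), not a disk bytenr, hence dump-tree's sectorsize complaint. A
quick standalone check:

/* quick check that the rejected "bytenr" is really a root objectid */
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t objectid = (uint64_t)-6;   /* BTRFS_TREE_LOG_OBJECTID */

    /* prints 18446744073709551610: a tree objectid, not a block
     * address, so it can never be aligned to the 4096 sectorsize */
    printf("%" PRIu64 "\n", objectid);
    return 0;
}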






read time tree block corruption detected

2021-04-16 Thread Gervais, Francois
We are using btrfs on our embedded devices and we got filesystem corruption
on one of them.

This product undergoes a lot of testing on our side and apparently this is
the first time it has happened, so it seems to be a pretty rare occurrence.
However, we still want to get to the bottom of it to ensure it doesn't happen
in the future.

Some background:
- The corruption happened on kernel v5.4.72.
- On the debug device I'm on master (v5.12.0-rc7), hoping that having all the
  latest patches might help.

Here is what kernel v5.12.0-rc7 tells me when trying to mount the partition:

Apr 16 19:31:45 buildroot kernel: BTRFS info (device loop0p3): disk space caching is enabled
Apr 16 19:31:45 buildroot kernel: BTRFS info (device loop0p3): has skinny extents
Apr 16 19:31:45 buildroot kernel: BTRFS info (device loop0p3): start tree-log replay
Apr 16 19:31:45 buildroot kernel: BTRFS critical (device loop0p3): corrupt leaf: root=18446744073709551610 block=790151168 slot=5 ino=5007, inode ref overflow, ptr 15853 end 15861 namelen 294
Apr 16 19:31:45 buildroot kernel: BTRFS error (device loop0p3): block=790151168 read time tree block corruption detected
Apr 16 19:31:45 buildroot kernel: BTRFS critical (device loop0p3): corrupt leaf: root=18446744073709551610 block=790151168 slot=5 ino=5007, inode ref overflow, ptr 15853 end 15861 namelen 294
Apr 16 19:31:45 buildroot kernel: BTRFS error (device loop0p3): block=790151168 read time tree block corruption detected
Apr 16 19:31:45 buildroot kernel: BTRFS: error (device loop0p3) in btrfs_recover_log_trees:6246: errno=-5 IO failure (Couldn't read tree log root.)
Apr 16 19:31:45 buildroot kernel: BTRFS: error (device loop0p3) in btrfs_replay_log:2341: errno=-5 IO failure (Failed to recover log tree)
Apr 16 19:31:45 buildroot e512c123daaa[468]: mount: /root/mnt: can't read superblock on /dev/loop0p3.
Apr 16 19:31:45 buildroot kernel: BTRFS error (device loop0p3): open_ctree failed: -5

Any suggestions?
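
One commonly suggested way to at least get read-only access to such a
filesystem while the log replay failure is investigated is mounting with
ro,nologreplay (supported by recent kernels; it skips the corrupted log tree
entirely). A sketch of the equivalent mount(2) call, using the device and
mount point from the log above:

/* sketch: read-only mount that skips tree-log replay; note that
 * nologreplay is only accepted together with a read-only mount */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
    if (mount("/dev/loop0p3", "/root/mnt", "btrfs",
              MS_RDONLY, "nologreplay") < 0) {
        perror("mount");
        return 1;
    }
    return 0;
}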

Filesystem corruption?

2018-10-22 Thread Gervais, Francois
Hi,

I think I lost power on my btrfs disk and it looks like it is now in a
non-functional state.

Any idea how I could debug that issue?

Here is what I have:

kernel 4.4.0-119-generic
btrfs-progs v4.4



sudo btrfs check /dev/sdd
Checking filesystem on /dev/sdd
UUID: 9a14b7a1-672c-44da-b49a-1f6566db3e44
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
checking quota groups
Ignoring qgroup relation key 310
Ignoring qgroup relation key 311
Ignoring qgroup relation key 313
Ignoring qgroup relation key 321
Ignoring qgroup relation key 326
Ignoring qgroup relation key 346
Ignoring qgroup relation key 354
Ignoring qgroup relation key 355
Ignoring qgroup relation key 356
Ignoring qgroup relation key 367
Ignoring qgroup relation key 370
Ignoring qgroup relation key 371
Ignoring qgroup relation key 373
Ignoring qgroup relation key 71213169107796323
Ignoring qgroup relation key 71213169107796323
Ignoring qgroup relation key 71494644084506935
Ignoring qgroup relation key 71494644084506935
Ignoring qgroup relation key 71494644084506937
Ignoring qgroup relation key 71494644084506937
Ignoring qgroup relation key 71494644084506945
Ignoring qgroup relation key 71494644084506945
Ignoring qgroup relation key 71494644084506950
Ignoring qgroup relation key 71494644084506950
Ignoring qgroup relation key 71494644084506970
Ignoring qgroup relation key 71494644084506970
Ignoring qgroup relation key 71494644084506978
Ignoring qgroup relation key 71494644084506978
Ignoring qgroup relation key 71494644084506978
Ignoring qgroup relation key 71494644084506980
Ignoring qgroup relation key 71494644084506980
Ignoring qgroup relation key 71494644084506991
Ignoring qgroup relation key 71494644084506991
Ignoring qgroup relation key 71494644084506994
Ignoring qgroup relation key 71494644084506994
Ignoring qgroup relation key 71494644084506995
Ignoring qgroup relation key 71494644084506995
Ignoring qgroup relation key 71494644084506997
Ignoring qgroup relation key 71494644084506997
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
Ignoring qgroup relation key 71776119061217590
found 29301522460 bytes used err is 0
total csum bytes: 27525424
total tree bytes: 541573120
total fs tree bytes: 494223360
total extent tree bytes: 16908288
btree space waste bytes: 85047903
file data blocks allocated: 273892241408
 referenced 44667650048
extent buffer leak: start 29360128 len 16384
extent buffer leak: start 740524032 len 16384
extent buffer leak: start 446840832 len 16384
extent buffer leak: start 142819328 len 16384
extent buffer leak: start 143179776 len 16384
extent buffer leak: start 184107008 len 16384
extent buffer leak: start 190513152 len 16384
extent buffer leak: start 190939136 len 16384
extent buffer leak: start 239943680 len 16384
extent buffer leak: start 29392896 len 16384
extent buffer leak: start 295223296 len 16384
extent buffer leak: start 30556160 len 16384
extent buffer leak: start 29376512 len 16384
extent buffer leak: start 29409280 len 16384
extent buffer leak: start 29491200 len 16384
extent buffer leak: start 29556736 len 16384
extent buffer leak: start 29720576 len 16384
extent buffer leak: start 29884416 len 16384
extent buffer leak: start 30097408 len 16384
extent buffer leak: start 30179328 len 16384
extent buffer leak: start 30228480 len 16384
extent buffer leak: start 30277632 len 16384
extent buffer leak: start 30343168 len 16384
extent buffer leak: start 30392320 len 16384
extent buffer leak: start 30457856 len 16384
extent buffer leak: start 30507008 len 16384
extent buffer leak: start 30572544 len 16384
extent buffer leak: start 30621696 len 16384
extent buffer leak: start 30670848 len 16384
extent buffer leak: start 3072 len 16384
extent buffer leak: start 30769152 len 16384
extent buffer leak: start 30801920 len 16384
extent buffer leak: start 30867456 len 16384
extent buffer leak: start 30916608 len 16384
extent buffer leak: start 102498304 len 16384
extent buffer leak: start 204488704 len 16384
extent buffer leak: start 237912064 len 16384
extent buffer leak: start 328499200 len 16384
extent buffer leak: start 684539904 len 16384
extent buffer leak: start 849362944 len 16384


CoW behavior when writing same content

2018-10-09 Thread Gervais, Francois
Hi,

If I overwrite a big file in a snapshot, but only a small portion of its
content is different, will the whole file be rewritten in the snapshot, or
only the part that differs?

Something like:

$ dd if=/dev/urandom of=/big_file bs=1M count=1024
$ cp /big_file root/
$ btrfs sub snap root snapshot
$ cp /big_file snapshot/

In this case, do root/big_file and snapshot/big_file still share the same data?

Thank you
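
One way to answer this empirically is to compare the physical extents of the
two files, e.g. with filefrag -v or a small FIEMAP program like the sketch
below (standard Linux ioctl; error handling mostly trimmed). If both files
print the same physical offsets, they still share data. For what it's worth,
a plain cp without --reflink rewrites every byte, and btrfs allocates new
extents for written data regardless of its content, so the copy into the
snapshot would be expected to break the sharing for the whole file.

/* print each extent's physical location so two files can be compared */
#include <fcntl.h>
#include <linux/fiemap.h>
#include <linux/fs.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* room for up to 64 extent records */
    struct fiemap *fm = calloc(1, sizeof(*fm) +
                                  64 * sizeof(struct fiemap_extent));
    fm->fm_length = ~0ULL;          /* map the whole file */
    fm->fm_extent_count = 64;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) {
        perror("FIEMAP");
        return 1;
    }

    for (unsigned i = 0; i < fm->fm_mapped_extents; i++) {
        struct fiemap_extent *e = &fm->fm_extents[i];

        printf("logical %llu physical %llu len %llu%s\n",
               (unsigned long long)e->fe_logical,
               (unsigned long long)e->fe_physical,
               (unsigned long long)e->fe_length,
               (e->fe_flags & FIEMAP_EXTENT_SHARED) ? " shared" : "");
    }
    free(fm);
    return 0;
}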


Re: btrfs receive incremental stream on another uuid

2018-09-18 Thread Gervais, Francois
> No. It is already possible (by setting received UUID); it should not be
> made too open to easy abuse.


Do you mean edit the UUID in the byte stream before btrfs receive?
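
For reference, "setting received UUID" here presumably refers to the
BTRFS_IOC_SET_RECEIVED_SUBVOL ioctl, which is what btrfs receive itself uses
to stamp a finished subvolume. A rough sketch of stamping a local clone so an
incremental stream will accept it as a parent; the path and UUID bytes are
placeholders, and a real use would normally fill in stransid as well:

/* rough sketch, assuming root privileges; path and UUID are placeholders */
#include <fcntl.h>
#include <linux/btrfs.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>

int main(void)
{
    /* the local subvolume that should stand in for the stream's parent */
    int fd = open("/mnt/clone-of-parent", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    struct btrfs_ioctl_received_subvol_args args;
    memset(&args, 0, sizeof(args));

    /* the 16 raw UUID bytes of the original parent subvolume, e.g.
     * taken from btrfs subvolume show (placeholder: all zeroes) */
    static const unsigned char parent_uuid[BTRFS_UUID_SIZE] = { 0 };
    memcpy(args.uuid, parent_uuid, BTRFS_UUID_SIZE);

    if (ioctl(fd, BTRFS_IOC_SET_RECEIVED_SUBVOL, &args) < 0) {
        perror("BTRFS_IOC_SET_RECEIVED_SUBVOL");
        return 1;
    }
    return 0;
}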


btrfs receive incremental stream on another uuid

2018-09-18 Thread Gervais, Francois


Hi,

I'm trying to apply a btrfs send diff (done through -p) to another subvolume 
with the same content as the proper parent but with a different uuid.

I looked through btrfs receive and I get the feeling that this is not possible 
right now.

I'm thinking of adding a -p option to btrfs receive which could override the 
parent information from the stream.

Would that make sense?