On 2019/6/23 6:15 PM, Andrei Borzenkov wrote:
[snip]
>> If the last command reports qgroup mismatch, then it means qgroup is
>> indeed incorrect.
>>
>
> no error reported.

Then it's not a bug; it is most likely caused by btrfs extent booking
behavior.

> 10:/home/bor # btrfs ins dump-tree -t 258 /dev/vdb
> btrfs-progs v5.1
> file tree key (258 ROOT_ITEM 0)
> item 5 key (257 INODE_REF 256) itemoff 15869 itemsize 14
> index 2 namelen 4 name: file

This is the inode we care about.

> item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
> generation 11 type 1 (regular)
> extent data disk byte 1291976704 nr 46137344
> extent data offset 0 nr 46137344 ram 46137344

A 44 MiB extent; this should be exclusive to subvol 258.

> item 7 key (257 EXTENT_DATA 46137344) itemoff 15763 itemsize 53
> generation 11 type 1 (regular)
> extent data disk byte 1338114048 nr 45875200
> extent data offset 0 nr 45875200 ram 45875200

Another 43.75 MiB extent, also exclusive to 258.

> item 8 key (257 EXTENT_DATA 92012544) itemoff 15710 itemsize 53
> generation 11 type 1 (regular)
> extent data disk byte 314966016 nr 262144
> extent data offset 0 nr 262144 ram 262144

Another 0.25 MiB extent. Also exclusive.

> item 9 key (257 EXTENT_DATA 92274688) itemoff 15657 itemsize 53
> generation 11 type 1 (regular)
> extent data disk byte 315228160 nr 12582912
> extent data offset 0 nr 12582912 ram 12582912

Another 12.0 MiB extent, also exclusive.

BTW, this many fragmented extents normally means the system is under
very high memory pressure, is short on memory, or is short on on-disk
space. The 100 MiB above should be in one large extent, not split into
so many small ones.

So 258 has 100 MiB of exclusive extents. No problem so far.

> item 10 key (257 EXTENT_DATA 104857600) itemoff 15604 itemsize 53
> generation 9 type 1 (regular)
> extent data disk byte 227016704 nr 43515904
> extent data offset 15728640 nr 27787264 ram 43515904

From this extent on, the data is shared: the data extent at 227016704
(len 41.5 MiB) is also referenced by another subvolume.
Just search for the bytenr 227016704; it also shows up in subvol 263.
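
As a quick cross-check of the sizes above, here is a minimal Python
sketch that converts the "nr" byte counts from the quoted dump into MiB
and sums the four extents marked exclusive. The variable names are made
up for illustration; only the numbers come from the dump.

```python
MIB = 1024 * 1024

# (file offset, disk bytenr, extent length in bytes), copied from
# items 6-9 of inode 257 in the subvol 258 dump quoted above.
exclusive_extents_258 = [
    (0,        1291976704, 46137344),
    (46137344, 1338114048, 45875200),
    (92012544, 314966016,  262144),
    (92274688, 315228160,  12582912),
]

sizes_mib = [length / MIB for _, _, length in exclusive_extents_258]
total_exclusive_mib = sum(sizes_mib)

# Item 10 references only part of the extent at bytenr 227016704.
shared_extent_len = 43515904   # full on-disk extent ("disk byte ... nr")
ref_offset = 15728640          # offset into the extent used by subvol 258
ref_len = 27787264             # bytes of the extent used by subvol 258

print(sizes_mib)                   # [44.0, 43.75, 0.25, 12.0]
print(total_exclusive_mib)         # 100.0
print(shared_extent_len / MIB)     # 41.5
print(ref_offset / MIB, ref_len / MIB)  # 15.0 26.5
```

So subvol 258 references only the last 26.5 MiB of the 41.5 MiB extent
at 227016704.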

[snip]
> file tree key (263 ROOT_ITEM 10)
> item 5 key (257 INODE_REF 256) itemoff 15869 itemsize 14
> index 2 namelen 4 name: file

From here on, this is the inode we care about.

> item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
> generation 9 type 1 (regular)
> extent data disk byte 137887744 nr 43778048
> extent data offset 0 nr 43778048 ram 43778048

Exclusive, 41.75 MiB.

> item 7 key (257 EXTENT_DATA 43778048) itemoff 15763 itemsize 53
> generation 9 type 1 (regular)
> extent data disk byte 181665792 nr 1310720
> extent data offset 0 nr 1310720 ram 1310720

Exclusive, 1.25 MiB.

> item 8 key (257 EXTENT_DATA 45088768) itemoff 15710 itemsize 53
> generation 9 type 1 (regular)
> extent data disk byte 182976512 nr 43778048
> extent data offset 0 nr 43778048 ram 43778048

Exclusive, 41.75 MiB.

> item 9 key (257 EXTENT_DATA 88866816) itemoff 15657 itemsize 53
> generation 9 type 1 (regular)
> extent data disk byte 226754560 nr 262144
> extent data offset 0 nr 262144 ram 262144
> extent compression 0 (none)

The data extent at bytenr 227016704 is shared between subvol 258 and
263. The difference is that subvol 258 references only part of the
extent, while 263 uses the full extent.

Btrfs qgroup calculates exclusive usage based on extents, not bytes, so
even if only part of an extent is shared, the whole extent is still
counted as shared.

So for subvol 263, your exclusive is 41.75 + 1.25 + 41.75 = 84.75 MiB.

In short, because qgroup works at the extent level, not the byte level,
you'll see strange behavior.

E.g. for my previous script, on a system with enough free memory, if you
only write 100 MiB, which is smaller than the data extent size limit
(128 MiB), only one subvolume will get 100 MiB exclusive while the other
one has no exclusive (except the 16K leaf).
But if you write 128 MiB, exactly the extent size limit, then both
subvolumes will have 128 MiB exclusive.
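
To make the extent-level rule concrete, here is a rough Python model of
it. This is only a sketch of the accounting idea described above, not
how the kernel implements qgroups. The function name and data layout
are hypothetical, metadata (such as the 16K leaf) is ignored, and the
0.25 MiB extent at 226754560 is left out because this message does not
classify it as exclusive or shared.

```python
from collections import defaultdict

MIB = 1024 * 1024

# subvolume id -> {disk bytenr: full on-disk extent length in bytes}
# Data extents only, taken from the two dumps quoted above.
refs = {
    258: {
        1291976704: 46137344,
        1338114048: 45875200,
        314966016: 262144,
        315228160: 12582912,
        227016704: 43515904,  # only partly referenced by 258
    },
    263: {
        137887744: 43778048,
        181665792: 1310720,
        182976512: 43778048,
        227016704: 43515904,  # fully referenced by 263
    },
}

def exclusive_bytes(refs):
    """Sum the full length of every extent referenced by one subvolume only.

    How much of an extent a subvolume references does not matter here:
    an extent with a second referencing subvolume counts as shared.
    """
    owners = defaultdict(set)
    for subvol, extents in refs.items():
        for bytenr in extents:
            owners[bytenr].add(subvol)
    return {
        subvol: sum(length for bytenr, length in extents.items()
                    if owners[bytenr] == {subvol})
        for subvol, extents in refs.items()
    }

excl = exclusive_bytes(refs)
for subvol, nbytes in sorted(excl.items()):
    print(subvol, nbytes / MIB)
# 258 100.0
# 263 84.75
```

Under this model the 41.5 MiB extent at 227016704 contributes nothing to
either subvolume's exclusive number, even though 258 uses only part of
it.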

Thanks,
Qu

> item 10 key (257 EXTENT_DATA 89128960) itemoff 15604 itemsize 53
> generation 9 type 1 (regular)
> extent data disk byte 227016704 nr 43515904
> extent data offset 0 nr 43515904 ram 43515904
> extent compression 0 (none)
[snip]
>
>> Also, I see your subvolume id is not continuous, did you created/removed
>> some other subvolumes during your test?
>>
>
> No. At least on this filesystem. I have recreated it several times, but
> since the last mkfs these were the only two subvolumes I created myself.
>