As the investigation into unexpected btrfs corruption goes on, here we expose a strange v1 space cache corruption.

The script has been uploaded as a gist:
https://gist.github.com/adam900710/d37f38070f7fc4d858ffe856c516b426

The script itself is pretty straightforward:

0) Create a btrfs with a large enough data chunk
   The original single data chunk created by mkfs is not large enough.
   Do a full balance to create a large enough data chunk, so the space
   cache will live in a data chunk which also has its own cache.

1) Run some fsstress load along with dm-log-writes
   The load is pretty small. Just -n 200 can reproduce it.
   dm-log-writes records all the operations for later analysis.

2) Use dm-log-writes to replay to each FLUSH and FUA operation and run fsck
   The script does this manually, just to check both FUA and FLUSH.
   In fact we can use the --check fua option to do it in one line.

   btrfs check won't return an error, since it detects the invalid free
   space cache and just ignores it, but we still get free space cache
   related error messages.

With this we can get free space cache corruption at both FLUSH and FUA operations.
And some of these corruptions can even survive across *several* transactions.

Furthermore, when such corruption happens, the space cache file extent seems to be CoWed instead of being overwritten.
In my test environment, the whole 64K file extent of the metadata block group cache just gets CoWed.
(In the previous trans its bytenr is XXX, but in the next trans it's YYY; the inode size doesn't change at all, but nbytes seems to be increasing.)

Although the kernel and btrfs check can both report such a problem thanks to the free space bytes difference, that's already the last line of defense.
The corrupted free space cache passes both the generation and csum checks.

I'll keep digging, but advice from anyone who is familiar with the free space cache would really help in this case.

Thanks,
Qu
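
P.S. For readers less familiar with the "free space bytes difference" check mentioned above, here is a rough sketch of the idea in Python. It is only an illustration, not the kernel or btrfs-progs code: the BlockGroup class, its field names and the (start, length) extent tuples are all hypothetical, and the formula assumes the check compares the total free bytes recorded in the cache against block group length minus used bytes minus bytes reserved for superblocks.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BlockGroup:
    # Hypothetical, simplified view of a block group (not a real btrfs structure).
    length: int       # total bytes covered by the block group
    used: int         # bytes currently allocated to extents
    bytes_super: int  # bytes reserved for superblock copies


def cached_free_bytes(free_extents: List[Tuple[int, int]]) -> int:
    """Sum the lengths of the (start, length) free extents read from a cache."""
    return sum(length for _start, length in free_extents)


def cache_matches(bg: BlockGroup, free_extents: List[Tuple[int, int]]) -> bool:
    """Return True if the cached free space total agrees with the block group.

    Only the totals are compared; the position of each extent is not checked.
    """
    expected = bg.length - bg.used - bg.bytes_super
    return cached_free_bytes(free_extents) == expected


if __name__ == "__main__":
    MiB = 1 << 20
    bg = BlockGroup(length=1024 * MiB, used=256 * MiB, bytes_super=0)
    sane_cache = [(0, 512 * MiB), (768 * MiB, 256 * MiB)]
    stale_cache = [(0, 512 * MiB)]  # misses 256 MiB of free space
    print(cache_matches(bg, sane_cache))
    print(cache_matches(bg, stale_cache))
```

A cache with valid generation and csum can still fail this comparison, which is why it acts as the last line of defense here.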