On 2017-12-10 07:12, Benjamin Beichler wrote:
> Hi Qu,
>
> 2017-12-07 12:09 GMT+00:00 Qu Wenruo <quwenruo.bt...@gmx.com>:
>>
>> Since the btrfs chunk recovery doesn't work and my dirty quick hack
>> doesn't work either, I don't expect much to recover.
>>
>> Unless we have more detailed info about how and why the BUG_ON() of
>> chunk recovery is triggered.
>>
>> That's to say, it will be quite time-consuming work to use gdb to
>> locate the problem, and see if any developer (mostly me) could use the
>> info to further dig into the problem or fix it.
>> (Considering the difference in timezone, I expect at least 8+ weeks to
>> get a conclusion)
>
> I'm really pleased that you want to help me, of course the current
> backtrace was quite useless.
> Firstly, I revised the code a bit, and since one run over the 1,7TB
> drive took about 6h, I thought about saving the state of already found
> chunks. I simply saved all bytenr which are valid to a file. The
> consequence was a reduction of the time for scan_one_device to about
> 30s. If you think this could be interesting for the normal version, I
> could create a patch for this.
>
>>
>> If you really want to do it, please step into the function
>> btrfs_insert_item() in __rebuild_device_items() and see at which
>> point -EIO is returned.
>>
>> My guess is btrfs_search_slot() call in btrfs_insert_empty_items().
>>
>> If that's true, please call
>>
>> btrfs_print_tree(root->fs_info->chunk_root, root->fs_info->chunk_root->node,
>> 1)
>>
>> in gdb, just before the btrfs_search_slot() call above, to show what's
>> the problem.
>>
> Your guess was right. The current stack trace and btrfs_print_tree is
> under : https://gist.github.com/anonymous/2cf40ac1d3ddcbca95177acec78041b2

The output is very helpful.

I was originally thinking it was something more serious, but it turns out
to be less serious than I expected.

>
> As you can see, the code in disk.io:321 explicitly excludes the
> sector from 0 to sectorsize, and states it is unaligned. I think
> because the code found a chunk/block at address zero, this triggers
> the problem. Is it possible that there live chunks/blocks at address
> 0 or is this fuzzy data?

0 is completely valid in btrfs logical address space.

It's the IS_ALIGNED macro which caused the problem.
So it's quite easy to fix, in fact.

For 0, always returning it as aligned should fix your problem.

Thanks,
Qu

>
>>
>> BTW, currently nothing in chunk tree/super block contains any info of
>> your fs, feel free to share it with the mail list, where more guys may help.
>>
> I added the list, I simply forgot it in some answer.
>
>> Thanks,
>> Qu
>>
>
> thanks
>
> Benjamin
>
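
To illustrate the suggested fix (treat bytenr 0 as aligned), here is a minimal standalone sketch. It is not the actual btrfs-progs code: the helper names is_aligned_u64(), bytenr_ok_strict() and bytenr_ok_relaxed() are hypothetical, and the "strict" variant only models the behaviour described in the quoted report (the range from 0 to sectorsize is rejected as unaligned).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if bytenr is a multiple of align.
 * Assumes align is a non-zero power of two (as a sector size is).
 */
bool is_aligned_u64(uint64_t bytenr, uint64_t align)
{
	return (bytenr & (align - 1)) == 0;
}

/*
 * Models the check described in the report: anything below one
 * sector, including 0, is rejected before alignment is considered.
 */
bool bytenr_ok_strict(uint64_t bytenr, uint32_t sectorsize)
{
	return bytenr >= sectorsize && is_aligned_u64(bytenr, sectorsize);
}

/*
 * Relaxed variant: logical address 0 is always accepted as aligned,
 * every other value must still be a multiple of sectorsize.
 */
bool bytenr_ok_relaxed(uint64_t bytenr, uint32_t sectorsize)
{
	return bytenr == 0 || is_aligned_u64(bytenr, sectorsize);
}
```

With a 4096-byte sector size, bytenr_ok_strict(0, 4096) is false while bytenr_ok_relaxed(0, 4096) is true; both still reject an address such as 100 that is not a multiple of the sector size.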