Hello Josef,

> A user sent me a btrfs-image of a file system that was panicking on mount
> during the log recovery.  I had originally thought these problems were from
> a bug in the free space cache code, but that was just a symptom of the
> problem.  The problem is if your application does something like this
>
> [prealloc][prealloc][prealloc]
>
> the internal extent maps will merge those all together into one extent map,
> even though on disk they are 3 separate extents.  So if you go to write into
> one of these ranges the extent map will be right since we use the physical
> extent when doing the write, but when we log the extents they will use the
> wrong sizes for the remainder prealloc space.  If this doesn't happen to
> trip up the free space cache (which it won't in a lot of cases) then you
> will get bogus entries in your extent tree which will screw stuff up later.
> The data and such will still work, but everything else is broken.
>
> This patch fixes this by not allowing extents that are on the modified list
> to be merged.  This has the side effect that we are no longer adding
> everything to the modified list all the time, which means we now have to
> call btrfs_drop_extents every time we log an extent into the tree.  So this
> allows me to drop all this speciality code I was using to get around calling
> btrfs_drop_extents.  With this patch the testcase I've created no longer
> creates a bogus file system after replaying the log.  Thanks,
>
> Signed-off-by: Josef Bacik <jba...@fusionio.com>
>
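To check my reading of the rule described above, here is a minimal userspace sketch of "do not merge extent maps that are on the modified list". It is not the kernel code: struct em_model, em_adjacent() and em_try_merge() are hypothetical names made up for illustration, and the adjacency test is a simplified assumption about what makes two maps mergeable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an in-memory extent map. */
struct em_model {
	uint64_t start;		/* offset in the file */
	uint64_t len;		/* length in bytes */
	uint64_t block_start;	/* location on disk */
	bool prealloc;		/* preallocated extent */
	bool on_modified_list;	/* queued to be logged */
};

/* Contiguous in the file and on disk, and of the same type. */
static bool em_adjacent(const struct em_model *prev,
			const struct em_model *next)
{
	return prev->start + prev->len == next->start &&
	       prev->block_start + prev->len == next->block_start &&
	       prev->prealloc == next->prealloc;
}

/*
 * Merge next into prev when allowed.  Maps that still have to be
 * logged keep their own size, so they are never merged.
 * Returns true when prev now covers both ranges.
 */
static bool em_try_merge(struct em_model *prev, const struct em_model *next)
{
	assert(prev && next);

	if (prev->on_modified_list || next->on_modified_list)
		return false;
	if (!em_adjacent(prev, next))
		return false;

	prev->len += next->len;
	return true;
}
```

In this model, two adjacent prealloc maps that are not queued for logging collapse into one, while a map on the modified list stays separate, so whatever is logged later still carries the size of its own extent.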
<snip>
> 	while (1) {
> 		write_lock(&em_tree->lock);
> -		err = add_extent_mapping(em_tree, hole_em);
> -		if (!err)
> -			list_move(&hole_em->list,
> -				  &em_tree->modified_extents);
> +		err = add_extent_mapping(em_tree, hole_em, 1);
> 		write_unlock(&em_tree->lock);
> 		if (err != -EEXIST)
> 			break;
> @@ -5989,7 +5977,8 @@ static int merge_extent_mapping(struct extent_map_tree *em_tree,
> 		em->block_start += start_diff;
> 		em->block_len -= start_diff;
> 	}
> -	return add_extent_mapping(em_tree, em);
> +	printk(KERN_ERR "merging here for %Lu\n", em->orig_start);

How about using something like pr_debug() here? When I tested btrfs-next,
I found that this message is printed far too often.

Thanks,
Wang

> +	return add_extent_mapping(em_tree, em, 0);
> }
>
> static noinline int uncompress_inline(struct btrfs_path *path,
> @@ -6283,7 +6272,7 @@ insert:
>
> 	err = 0;
> 	write_lock(&em_tree->lock);
> -	ret = add_extent_mapping(em_tree, em);
> +	ret = add_extent_mapping(em_tree, em, 0);
> 	/* it is possible that someone inserted the extent into the tree
> 	 * while we had the lock dropped.  It is also possible that
> 	 * an overlapping map exists in the tree
> @@ -6706,10 +6695,7 @@ static struct extent_map *create_pinned_em(struct inode *inode, u64 start,
> 		btrfs_drop_extent_cache(inode, em->start,
> 					em->start + em->len - 1, 0);
> 		write_lock(&em_tree->lock);
> -		ret = add_extent_mapping(em_tree, em);
> -		if (!ret)
> -			list_move(&em->list,
> -				  &em_tree->modified_extents);
> +		ret = add_extent_mapping(em_tree, em, 1);
> 		write_unlock(&em_tree->lock);
> 	} while (ret == -EEXIST);
<snip>
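On the printk() in merge_extent_mapping(): the point of pr_debug() is that the message stays silent unless debugging is switched on (DEBUG defined at build time, or dynamic debug enabled for that call site). A rough userspace sketch of that behaviour, assuming a simple runtime flag; debug_enabled and debug_log() are hypothetical stand-ins for illustration, not kernel APIs:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical runtime switch; stands in for the kernel's debug controls. */
static bool debug_enabled;

/*
 * Print only when debugging is switched on.  Returns the number of
 * characters written, or 0 when the message is suppressed.
 */
static int debug_log(const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);

	if (!debug_enabled)
		return 0;

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}
```

With the flag off, a call such as debug_log("merging here for %llu\n", 4096ULL) prints nothing, which is the behaviour wanted for a message on a hot path. In the patch itself the equivalent would presumably be pr_debug("merging here for %llu\n", em->orig_start), or simply dropping the line if it was only there for testing.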