Hello Josef,

> A user sent me a btrfs-image of a file system that was panicking on mount
> during the log recovery.  I had originally thought these problems were from
> a bug in the free space cache code, but that was just a symptom of the
> problem.  The problem is if your application does something like this
> 
> [prealloc][prealloc][prealloc]
> 
> the internal extent maps will merge those all together into one extent map,
> even though on disk they are 3 separate extents.  So if you go to write into
> one of these ranges the extent map will be right since we use the physical
> extent when doing the write, but when we log the extents they will use the
> wrong sizes for the remainder prealloc space.  If this doesn't happen to trip
> up the free space cache (which it won't in a lot of cases) then you will get
> bogus entries in your extent tree which will screw stuff up later.  The data
> and such will still work, but everything else is broken.  This patch fixes
> this by not allowing extents that are on the modified list to be merged.
> This has the side effect that we are no longer adding everything to the
> modified list all the time, which means we now have to call
> btrfs_drop_extents every time we log an extent into the tree.  So this allows
> me to drop all this speciality code I was using to get around calling
> btrfs_drop_extents.  With this patch the testcase I've created no longer
> creates a bogus file system after replaying the log.  Thanks,
> 
> Signed-off-by: Josef Bacik <jba...@fusionio.com>
>  
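
For readers following along, the merge rule described above can be modelled
in userspace roughly as follows. This is only a sketch under stated
assumptions: struct toy_em, toy_mergeable() and toy_merge_all() are
hypothetical names invented for illustration, not the kernel's extent_map
code, and the "modified" flag stands in for membership of the modified list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an extent map (not the kernel struct). */
struct toy_em {
	uint64_t start;       /* logical offset in the file */
	uint64_t len;         /* length in bytes */
	uint64_t block_start; /* physical start on disk */
	bool prealloc;        /* preallocated extent */
	bool modified;        /* assumed: still waiting to be logged */
};

/*
 * Two neighbours may merge only if they are logically and physically
 * contiguous, of the same type, and neither is waiting to be logged.
 */
static bool toy_mergeable(const struct toy_em *prev, const struct toy_em *next)
{
	if (prev->modified || next->modified)
		return false;
	if (prev->prealloc != next->prealloc)
		return false;
	return prev->start + prev->len == next->start &&
	       prev->block_start + prev->len == next->block_start;
}

/* Merge neighbours of a sorted array in place and return the new count. */
static size_t toy_merge_all(struct toy_em *ems, size_t n)
{
	size_t out = 0;

	if (n == 0)
		return 0;
	for (size_t i = 1; i < n; i++) {
		if (toy_mergeable(&ems[out], &ems[i]))
			ems[out].len += ems[i].len; /* fold into previous map */
		else
			ems[++out] = ems[i];        /* keep as a separate map */
	}
	return out + 1;
}
```

In this model three contiguous, unmodified preallocs collapse into one map,
while marking any of them as modified keeps its on-disk size intact for
logging.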

<snip>
>                       while (1) {
>                               write_lock(&em_tree->lock);
> -                             err = add_extent_mapping(em_tree, hole_em);
> -                             if (!err)
> -                                     list_move(&hole_em->list,
> -                                               &em_tree->modified_extents);
> +                             err = add_extent_mapping(em_tree, hole_em, 1);
>                               write_unlock(&em_tree->lock);
>                               if (err != -EEXIST)
>                                       break;
> @@ -5989,7 +5977,8 @@ static int merge_extent_mapping(struct extent_map_tree *em_tree,
>               em->block_start += start_diff;
>               em->block_len -= start_diff;
>       }
> -     return add_extent_mapping(em_tree, em);
> +     printk(KERN_ERR "merging here for %Lu\n", em->orig_start);

        How about using something like pr_debug() here?
        When I tested btrfs-next, this message was printed far too often.
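
        To illustrate the idea outside the kernel, here is a minimal
        userspace sketch of a debug print that is silent unless explicitly
        enabled at build time. The names toy_pr_debug, TOY_DEBUG and
        toy_log_merge are hypothetical; in the kernel itself the change
        would simply use pr_debug() in place of printk(KERN_ERR ...).

```c
#include <stdio.h>

/*
 * Hypothetical userspace analogue of a compiled-out debug print.
 * With TOY_DEBUG defined it prints to stderr; without it the call is
 * never executed, but the format string is still type-checked.
 */
#ifdef TOY_DEBUG
#define toy_pr_debug(...) fprintf(stderr, __VA_ARGS__)
#else
#define toy_pr_debug(...) (0 ? fprintf(stderr, __VA_ARGS__) : 0)
#endif

/*
 * Returns the number of characters printed, purely so the behaviour is
 * easy to check; userspace printf needs %llu rather than %Lu here.
 */
static int toy_log_merge(unsigned long long orig_start)
{
	return toy_pr_debug("merging here for %llu\n", orig_start);
}
```

        Built without -DTOY_DEBUG, toy_log_merge() prints nothing, which is
        the behaviour wanted on a hot path like this one.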


Thanks,
Wang
        
> +     return add_extent_mapping(em_tree, em, 0);
> }
> 
> static noinline int uncompress_inline(struct btrfs_path *path,
> @@ -6283,7 +6272,7 @@ insert:
> 
>       err = 0;
>       write_lock(&em_tree->lock);
> -     ret = add_extent_mapping(em_tree, em);
> +     ret = add_extent_mapping(em_tree, em, 0);
>       /* it is possible that someone inserted the extent into the tree
>        * while we had the lock dropped.  It is also possible that
>        * an overlapping map exists in the tree
> @@ -6706,10 +6695,7 @@ static struct extent_map *create_pinned_em(struct inode *inode, u64 start,
>               btrfs_drop_extent_cache(inode, em->start,
>                               em->start + em->len - 1, 0);
>               write_lock(&em_tree->lock);
> -             ret = add_extent_mapping(em_tree, em);
> -             if (!ret)
> -                     list_move(&em->list,
> -                               &em_tree->modified_extents);
> +             ret = add_extent_mapping(em_tree, em, 1);
>               write_unlock(&em_tree->lock);
>       } while (ret == -EEXIST);
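
The caller-side change in these hunks (dropping the explicit list_move() and
passing a third argument instead) suggests the function now handles the
modified list itself. A userspace sketch of that shape, assuming this is what
the new argument does; toy_tree, toy_map and toy_add_mapping() are
hypothetical names and the real add_extent_mapping() may differ:

```c
#include <errno.h>
#include <stddef.h>

#define TOY_MAX 8

/* Hypothetical stand-ins; a flag replaces the kernel's modified list. */
struct toy_map {
	unsigned long long start;
	unsigned long long len;
	int on_modified_list;
};

struct toy_tree {
	struct toy_map *maps[TOY_MAX];
	size_t count;
};

/*
 * Insert em, refusing overlaps with -EEXIST.  On success, and only on
 * success, record whether the caller wants em treated as modified.
 */
static int toy_add_mapping(struct toy_tree *tree, struct toy_map *em,
			   int modified)
{
	for (size_t i = 0; i < tree->count; i++) {
		const struct toy_map *cur = tree->maps[i];

		if (em->start < cur->start + cur->len &&
		    cur->start < em->start + em->len)
			return -EEXIST;
	}
	if (tree->count == TOY_MAX)
		return -ENOMEM;

	tree->maps[tree->count++] = em;
	em->on_modified_list = modified ? 1 : 0;
	return 0;
}
```

Folding the flag into the insert keeps the "add, then mark as modified"
sequence in one place instead of repeating it at each call site.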

<snip>
