3.13.11-ckt25 -stable review patch.  If anyone has any objections, please let 
me know.

------------------

From: Filipe Manana <fdman...@suse.com>

commit ae9d8f17118551bedd797406a6768b87c2146234 upstream.

While the inode cache caching kthread is calling btrfs_unpin_free_ino(),
we could have a concurrent call to btrfs_return_ino() that adds a new
entry to the root's free space cache of pinned inodes. This concurrent
call does not acquire the fs_info->commit_root_sem before adding a new
entry if the caching state is BTRFS_CACHE_FINISHED, which is a problem
because the caching kthread calls btrfs_unpin_free_ino() after setting
the caching state to BTRFS_CACHE_FINISHED and therefore races with
the task calling btrfs_return_ino(), which is adding a new entry, while
the former (caching kthread) is navigating the cache's rbtree, removing
and freeing nodes from the cache's rbtree without acquiring the spinlock
that protects the rbtree.

This race resulted in memory corruption due to double free of struct
btrfs_free_space objects because both tasks can end up doing freeing the
same objects. Note that adding a new entry can result in merging it with
other entries in the cache, in which case those entries are freed.
This is particularly important as btrfs_free_space structures are also
used for the block group free space caches.

This memory corruption can be detected by a debugging kernel, which
reports it with the following trace:

[132408.501148] slab error in verify_redzone_free(): cache `btrfs_free_space': 
double free detected
[132408.505075] CPU: 15 PID: 12248 Comm: btrfs-ino-cache Tainted: G        W    
   4.1.0-rc5-btrfs-next-10+ #1
[132408.505075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
[132408.505075]  ffff880023e7d320 ffff880163d73cd8 ffffffff8145eec7 
ffffffff81095dce
[132408.505075]  ffff880009735d40 ffff880163d73ce8 ffffffff81154e1e 
ffff880163d73d68
[132408.505075]  ffffffff81155733 ffffffffa054a95a ffff8801b6099f00 
ffffffffa0505b5f
[132408.505075] Call Trace:
[132408.505075]  [<ffffffff8145eec7>] dump_stack+0x4f/0x7b
[132408.505075]  [<ffffffff81095dce>] ? console_unlock+0x356/0x3a2
[132408.505075]  [<ffffffff81154e1e>] __slab_error.isra.28+0x25/0x36
[132408.505075]  [<ffffffff81155733>] __cache_free+0xe2/0x4b6
[132408.505075]  [<ffffffffa054a95a>] ? __btrfs_add_free_space+0x2f0/0x343 
[btrfs]
[132408.505075]  [<ffffffffa0505b5f>] ? btrfs_unpin_free_ino+0x8e/0x99 [btrfs]
[132408.505075]  [<ffffffff810f3b30>] ? time_hardirqs_off+0x15/0x28
[132408.505075]  [<ffffffff81084d42>] ? trace_hardirqs_off+0xd/0xf
[132408.505075]  [<ffffffff811563a1>] ? kfree+0xb6/0x14e
[132408.505075]  [<ffffffff811563d0>] kfree+0xe5/0x14e
[132408.505075]  [<ffffffffa0505b5f>] btrfs_unpin_free_ino+0x8e/0x99 [btrfs]
[132408.505075]  [<ffffffffa0505e08>] caching_kthread+0x29e/0x2d9 [btrfs]
[132408.505075]  [<ffffffffa0505b6a>] ? btrfs_unpin_free_ino+0x99/0x99 [btrfs]
[132408.505075]  [<ffffffff8106698f>] kthread+0xef/0xf7
[132408.505075]  [<ffffffff810f3b08>] ? time_hardirqs_on+0x15/0x28
[132408.505075]  [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad
[132408.505075]  [<ffffffff814653d2>] ret_from_fork+0x42/0x70
[132408.505075]  [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad
[132408.505075] ffff880023e7d320: redzone 1:0x9f911029d74e35b, redzone 
2:0x9f911029d74e35b.
[132409.501654] slab: double free detected in cache 'btrfs_free_space', objp 
ffff880023e7d320
[132409.503355] ------------[ cut here ]------------
[132409.504241] kernel BUG at mm/slab.c:2571!

Therefore fix this by having btrfs_unpin_free_ino() acquire the lock
that protects the rbtree while doing the searches and removing entries.

Fixes: 1c70d8fb4dfa ("Btrfs: fix inode caching vs tree log")
Signed-off-by: Filipe Manana <fdman...@suse.com>
Signed-off-by: Chris Mason <c...@fb.com>
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriq...@canonical.com>

Signed-off-by: Kamal Mostafa <ka...@canonical.com>
---
 fs/btrfs/inode-map.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/inode-map.c b/fs/btrfs/inode-map.c
index 9f8c7c7..cb5c24c 100644
--- a/fs/btrfs/inode-map.c
+++ b/fs/btrfs/inode-map.c
@@ -242,6 +242,7 @@ void btrfs_unpin_free_ino(struct btrfs_root *root)
 {
        struct btrfs_free_space_ctl *ctl = root->free_ino_ctl;
        struct rb_root *rbroot = &root->free_ino_pinned->free_space_offset;
+       spinlock_t *rbroot_lock = &root->free_ino_pinned->tree_lock;
        struct btrfs_free_space *info;
        struct rb_node *n;
        u64 count;
@@ -250,23 +251,29 @@ void btrfs_unpin_free_ino(struct btrfs_root *root)
                return;
 
        while (1) {
+               bool add_to_ctl = true;
+
+               spin_lock(rbroot_lock);
                n = rb_first(rbroot);
-               if (!n)
+               if (!n) {
+                       spin_unlock(rbroot_lock);
                        break;
+               }
 
                info = rb_entry(n, struct btrfs_free_space, offset_index);
                BUG_ON(info->bitmap); /* Logic error */
 
                if (info->offset > root->cache_progress)
-                       goto free;
+                       add_to_ctl = false;
                else if (info->offset + info->bytes > root->cache_progress)
                        count = root->cache_progress - info->offset + 1;
                else
                        count = info->bytes;
 
-               __btrfs_add_free_space(ctl, info->offset, count);
-free:
                rb_erase(&info->offset_index, rbroot);
+               spin_unlock(rbroot_lock);
+               if (add_to_ctl)
+                       __btrfs_add_free_space(ctl, info->offset, count);
                kmem_cache_free(btrfs_free_space_cachep, info);
        }
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to