On Thu, May 16, 2013 at 01:34:11PM +0800, Miao Xie wrote:
> On Thu, 16 May 2013 13:15:57 +0800, Liu Bo wrote:
> > On Thu, May 16, 2013 at 12:31:11PM +0800, Miao Xie wrote:
> >> On Thu, 16 May 2013 11:36:46 +0800, Liu Bo wrote:
> >>> On Wed, May 15, 2013 at 03:48:20PM +0800, Miao Xie wrote:
> >>>> The grab/put functions will be used in the next patch, which needs to grab
> >>>> the root object and ensure it is not freed. We use a reference counter
> >>>> instead of the srcu lock to avoid blocking the memory reclaim task,
> >>>> which invokes synchronize_srcu().
> >>>>
> >>>
> >>> I don't think this is necessary, we put 'kfree(root)' because we really
> >>> need to free them at the very end time, when there should be no inodes
> >>> linking on the root (we should have cleaned all inodes out from it).
> >>>
> >>> So when we flush delalloc inodes and wait for ordered extents to finish,
> >>> the root should be valid, otherwise someone is doing wrong things.
> >>>
> >>> And even with this grab_fs_root to avoid freeing root, it's just the
> >>> root that remains in memory, all its attributes, like root->node,
> >>> commit_root, root->inode_tree, are already NULL or empty.
> >>
> >> Please consider the case:
> >> Task1                      Task2                     Cleaner
> >> get the root
> >> flush all delalloc inodes
> >>                            drop subvolume
> >>                            iput the last inode
> >>                              move the root into the dead list
> >>                                                      drop subvolume
> >>                                                      kfree(root)
> >> If Task1 accesses the root now, oops will happen.
> >
> > Then it's task1's fault, why is it not protected by a subvol_srcu section when
> > it's possible that someone like task2 sets root's refs to 0?
> >
> > synchronize_srcu(subvol_srcu) before adding root into dead root list is
> > just for this race case, why do we need another?
>
> Please read my changelog.

'The memory reclaim task' in the changelog refers to
iput -> inode_tree_del, right?

I don't like special cases. This get/put is different from our usual way:

	if (atomic_dec_and_test(refs)) {
		kfree(A->a);
		kfree(A->b);
		kfree(A);
	}

According to the pictured case, task1 may also access root->something.

I must say that the patch itself looks harmless, but the reason given is
not good enough.

thanks,
liubo

>
> Miao
>
> >
> > thanks,
> > liubo
> >
> >>
> >> I introduced these two functions just to protect access to the root
> >> object, not its attributes, so don't worry about the attributes. (Please
> >> see the first sentence of the changelog.)
> >>
> >> Thanks
> >> Miao
> >>
> >>>
> >>> thanks,
> >>> liubo
> >>>
> >>>> Signed-off-by: Miao Xie <mi...@cn.fujitsu.com>
> >>>> ---
> >>>>  fs/btrfs/ctree.h       |  1 +
> >>>>  fs/btrfs/disk-io.c     |  5 +++--
> >>>>  fs/btrfs/disk-io.h     | 21 +++++++++++++++++++++
> >>>>  fs/btrfs/extent-tree.c |  2 +-
> >>>>  4 files changed, 26 insertions(+), 3 deletions(-)
> >>>>
> >>>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> >>>> index 845b77f..958ce6c 100644
> >>>> --- a/fs/btrfs/ctree.h
> >>>> +++ b/fs/btrfs/ctree.h
> >>>> @@ -1739,6 +1739,7 @@ struct btrfs_root {
> >>>>  	int force_cow;
> >>>>
> >>>>  	spinlock_t root_item_lock;
> >>>> +	atomic_t refs;
> >>>>  };
> >>>>
> >>>>  struct btrfs_ioctl_defrag_range_args {
> >>>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> >>>> index 42d6ba2..642c861 100644
> >>>> --- a/fs/btrfs/disk-io.c
> >>>> +++ b/fs/btrfs/disk-io.c
> >>>> @@ -1216,6 +1216,7 @@ static void __setup_root(u32 nodesize, u32 leafsize, u32 sectorsize,
> >>>>  	atomic_set(&root->log_writers, 0);
> >>>>  	atomic_set(&root->log_batch, 0);
> >>>>  	atomic_set(&root->orphan_inodes, 0);
> >>>> +	atomic_set(&root->refs, 1);
> >>>>  	root->log_transid = 0;
> >>>>  	root->last_log_commit = 0;
> >>>>  	extent_io_tree_init(&root->dirty_log_pages,
> >>>> @@ -2049,7 +2050,7 @@ static void del_fs_roots(struct btrfs_fs_info *fs_info)
> >>>>  		} else {
> >>>>  			free_extent_buffer(gang[0]->node);
> >>>>  			free_extent_buffer(gang[0]->commit_root);
> >>>> -			kfree(gang[0]);
> >>>> +			btrfs_put_fs_root(gang[0]);
> >>>>  		}
> >>>>  	}
> >>>>
> >>>> @@ -3415,7 +3416,7 @@ static void free_fs_root(struct btrfs_root *root)
> >>>>  	kfree(root->free_ino_ctl);
> >>>>  	kfree(root->free_ino_pinned);
> >>>>  	kfree(root->name);
> >>>> -	kfree(root);
> >>>> +	btrfs_put_fs_root(root);
> >>>>  }
> >>>>
> >>>>  void btrfs_free_fs_root(struct btrfs_root *root)
> >>>> diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
> >>>> index 534d583..b71acd6e 100644
> >>>> --- a/fs/btrfs/disk-io.h
> >>>> +++ b/fs/btrfs/disk-io.h
> >>>> @@ -76,6 +76,27 @@ void btrfs_btree_balance_dirty_nodelay(struct btrfs_root *root);
> >>>>  void btrfs_drop_and_free_fs_root(struct btrfs_fs_info *fs_info,
> >>>>  				 struct btrfs_root *root);
> >>>>  void btrfs_free_fs_root(struct btrfs_root *root);
> >>>> +
> >>>> +/*
> >>>> + * This function is used to grab the root, and avoid it is freed when we
> >>>> + * access it. But it doesn't ensure that the tree is not dropped.
> >>>> + *
> >>>> + * If you want to ensure the whole tree is safe, you should use
> >>>> + * fs_info->subvol_srcu
> >>>> + */
> >>>> +static inline struct btrfs_root *btrfs_grab_fs_root(struct btrfs_root *root)
> >>>> +{
> >>>> +	if (atomic_inc_not_zero(&root->refs))
> >>>> +		return root;
> >>>> +	return NULL;
> >>>> +}
> >>>> +
> >>>> +static inline void btrfs_put_fs_root(struct btrfs_root *root)
> >>>> +{
> >>>> +	if (atomic_dec_and_test(&root->refs))
> >>>> +		kfree(root);
> >>>> +}
> >>>> +
> >>>>  void btrfs_mark_buffer_dirty(struct extent_buffer *buf);
> >>>>  int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid,
> >>>>  			  int atomic);
> >>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> >>>> index 08e42c8..08f9862 100644
> >>>> --- a/fs/btrfs/extent-tree.c
> >>>> +++ b/fs/btrfs/extent-tree.c
> >>>> @@ -7463,7 +7463,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
> >>>>  	} else {
> >>>>  		free_extent_buffer(root->node);
> >>>>  		free_extent_buffer(root->commit_root);
> >>>> -		kfree(root);
> >>>> +		btrfs_put_fs_root(root);
> >>>>  	}
> >>>>  out_end_trans:
> >>>>  	btrfs_end_transaction_throttle(trans, tree_root);
> >>>> --
> >>>> 1.8.1.4
> >>>>
> >>>> --
> >>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> >>>> the body of a message to majord...@vger.kernel.org
> >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
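
For readers following the get/put discussion, the pattern in the quoted patch (an inc-not-zero grab paired with a dec-and-test put) can be sketched in userspace roughly as below. This is only an illustration, not the kernel code: struct demo_root and the demo_root_* functions are hypothetical names, and C11 <stdatomic.h> is assumed as a stand-in for the kernel's atomic_t helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct btrfs_root; only the counter matters. */
struct demo_root {
	atomic_int refs;
	int objectid;
};

/* Allocate a root; the creator holds the initial reference (refs == 1). */
static struct demo_root *demo_root_alloc(int objectid)
{
	struct demo_root *root = malloc(sizeof(*root));

	if (!root)
		return NULL;
	atomic_init(&root->refs, 1);
	root->objectid = objectid;
	return root;
}

/*
 * Userspace analogue of atomic_inc_not_zero(): take a reference only if
 * the count has not already dropped to zero. Returns NULL otherwise.
 */
static struct demo_root *demo_root_grab(struct demo_root *root)
{
	int old = atomic_load(&root->refs);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&root->refs, &old, old + 1))
			return root;
	}
	return NULL;
}

/*
 * Analogue of atomic_dec_and_test() + kfree(): drop one reference and
 * free the structure when the last one goes away. Returns true if freed.
 */
static bool demo_root_put(struct demo_root *root)
{
	int old = atomic_fetch_sub(&root->refs, 1);

	assert(old > 0);	/* a put without a matching reference is a bug */
	if (old == 1) {
		free(root);
		return true;
	}
	return false;
}
```

As in the patch, a successful grab only keeps the structure itself from being freed. It says nothing about the state of the members, and the grab still has to be made while the caller can otherwise trust the pointer.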