On 12.11.2020 18:39, Pavel Tikhomirov wrote: > > > On 11/12/20 6:17 PM, Kirill Tkhai wrote: >> On 12.11.2020 18:08, Andrey Zhadchenko wrote: >>> On Thu, 12 Nov 2020 17:29:21 +0300 >>> Kirill Tkhai <ktk...@virtuozzo.com> wrote: >>> >>> Hi, Kirill, >>> >>>> Hi, Andrey, >>>> >>>> On 11.11.2020 10:32, Andrey Zhadchenko wrote: >>>>> Use more generic igrab instead of atomic inc. Move cgroup_hash_del >>>>> to eviction stage to avoid deadlock. >>>>> >>>>> Signed-off-by: Andrey Zhadchenko <andrey.zhadche...@virtuozzo.com> >>>>> --- >>>>> >>>>> v2: adjusted function call order in cgroup_evict_inode to match >>>>> existing code >>>>> >>>>> kernel/cgroup.c | 25 ++++++++----------------- >>>>> 1 file changed, 8 insertions(+), 17 deletions(-) >>>>> >>>>> diff --git a/kernel/cgroup.c b/kernel/cgroup.c >>>>> index 27d7a5e..8c2cef8 100644 >>>>> --- a/kernel/cgroup.c >>>>> +++ b/kernel/cgroup.c >>>>> @@ -1522,21 +1522,10 @@ static struct inode >>>>> *cgroup_find_inode(unsigned long fh[2], char take_ref) struct inode >>>>> *ret = NULL; >>>>> spin_lock(&cgroup_inode_table_lock); >>>>> - item = cgroup_find_item_no_lock(fh); >>>>> - /* >>>>> - * If we need to increase refcount, we should be aware of >>>>> possible >>>>> - * deadlock. Another thread may have started deleting this >>>>> inode: >>>>> - * iput->iput_final->cgroup_delete_inode->cgroup_hash_del >>>>> - * If we just call igrab, it will try to take i_lock and >>>>> this will >>>>> - * result in deadlock, because deleting thread has already >>>>> taken it >>>>> - * and waits on cgroup_inode_table_lock to find inode in >>>>> hashtable. >>>>> - * >>>>> - * If i_count is zero, someone is deleting it -> skip. >>>>> - */ >>>>> - if (take_ref && item) >>>>> - if (!atomic_inc_not_zero(&item->inode->i_count)) >>>>> - item = NULL; >>>>> + item = cgroup_find_item_no_lock(fh); >>>>> + if (item && take_ref && !igrab(item->inode)) >>>>> + item = NULL; >>>> >>>> Here you call igrab() under cgroup_inode_table_lock, so the check for >>>> (inode->i_state & I_FREEING|I_WILL_FREE) is done under that lock and >>>> i_lock. >>>> >>>> But clear_inode() sets "inode->i_state = I_FREEING | I_CLEAR" without >>>> any lock. This place does not look obvious. Isn't there some problem? >>>> >>> >>> iput_final() sets inode to I_FREEING under i_lock, release it and calls >>> evict(), which calls cgroup_evict_inode() via >>> inode->i_sb->s_op->evict(). >> >> What is about inodes in LRU? > > I thought iput_final:drop is always true for cgroupfs. And there is no lru.
That is OK. >> >> Say, iput_final() moves inode to LRU with zero counter. I_FREEING is not >> set there. Later prune_icache_sb() operates with that inode with zero >> counter. >> >> Doesn't dispose_list()->evict()->cgroup_evict_inode() race with >> cgroup_find_inode()? >> >>> If I get it right, this seems to be fine. As soon as iput_final starts >>> working igrab shouldn't be able to take ref on this inode. >>> >>>>> spin_unlock(&cgroup_inode_table_lock); >>>>> @@ -1634,15 +1623,17 @@ static const struct export_operations >>>>> cgroup_export_ops = { .fh_to_dentry = cgroup_fh_to_dentry, >>>>> }; >>>>> -static int cgroup_delete_inode(struct inode *inode) >>>>> +static void cgroup_evict_inode(struct inode *inode) >>>>> { >>>>> + truncate_inode_pages_final(&inode->i_data); >>>>> + clear_inode(inode); >>>>> cgroup_hash_del(inode); >>>>> - return generic_delete_inode(inode); >>>>> } >>>>> static const struct super_operations cgroup_ops = { >>>>> .statfs = simple_statfs, >>>>> - .drop_inode = cgroup_delete_inode, >>>>> + .drop_inode = generic_delete_inode, >>>>> + .evict_inode = cgroup_evict_inode, >>>>> .show_options = cgroup_show_options, >>>>> #ifdef CONFIG_VE >>>>> .show_path = cgroup_show_path, >>>>> >>>> >>> >> > _______________________________________________ Devel mailing list Devel@openvz.org https://lists.openvz.org/mailman/listinfo/devel