Jianyong, Jianyong Wu wrote on Wed, Sep 23, 2020: > Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall > bug in 9p. But there is race issue in fid comtention. > As Greg's patch stores all of fids from opened files into according inode, > so all the lookup fid ops can retrieve fid from inode preferentially. But > there is no mechanism to handle the fid comtention issue. For example, > there are two threads get the same fid in the same time and one of them > clunk the fid before the other thread ready to discard the fid. In this > scenario, it will lead to some fatal problems, even kernel core dump. > > I introduce a mechanism to fix this race issue. A counter field introduced > into p9_fid struct to store the reference counter to the fid. When a fid > is allocated from the inode or dentry, the counter will increase, and > will decrease at the end of its occupation. It is guaranteed that the > fid won't be clunked before the reference counter go down to 0, then > we can avoid the clunked fid to be used. > > tests: > race issue test from the old test case: > for file in {01..50}; do touch f.${file}; done > seq 1 1000 | xargs -n 1 -P 50 -I{} cat f.* > /dev/null > > open-unlink-f*syscall test: > I have tested for f*syscall include: ftruncate fstat fchown fchmod faccessat. > > Signed-off-by: Jianyong Wu <jianyong...@arm.com>
Ok so I've finally taken some time to test -- sorry for the delay. I didn't bother with qemu but the use-after-close f* calls work with nfs-ganesha and it doesn't introduce any obvious regression with the current qemu either, so this is good for me. Greg, Christian - from what I understood (in private, hopefully I'm allowed to repeat!), he won't be able to contribute to qemu because of company policies and I'm unlikely to take the time either right now. I don't think it's a problem to continue as is though, we can land linux kernel support (it's still useful for non-qemu servers) and if someone is interested later on they'll just need to finish that bit. I'm not seeing any obvious problem now so I'll take these patches in -next within the next few days, with this extra patch on top that basically applies the requests I had: - removing the extra atomic_set in fs/9p/fid.c - convert from atomic to refcount API (overflow checks) - rename a no-composant walk to clone_fid() I'll just run some more instrumented tests first to check we're not leaking anything but so far I haven't found any problem. If you noticed anything else please speak up. Thanks for taking the time to finish this! :) --- fs/9p/fid.c | 10 ++++------ fs/9p/fid.h | 2 +- include/net/9p/client.h | 2 +- net/9p/client.c | 4 ++-- 4 files changed, 8 insertions(+), 10 deletions(-) diff --git a/fs/9p/fid.c b/fs/9p/fid.c index 89643dabcdae..50118ec72a92 100644 --- a/fs/9p/fid.c +++ b/fs/9p/fid.c @@ -28,7 +28,6 @@ static inline void __add_fid(struct dentry *dentry, struct p9_fid *fid) { - atomic_set(&fid->count, 1); hlist_add_head(&fid->dlist, (struct hlist_head *)&dentry->d_fsdata); } @@ -62,7 +61,7 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid) } } if (ret && !IS_ERR(ret)) - atomic_inc(&ret->count); + refcount_inc(&ret->count); spin_unlock(&inode->i_lock); return ret; } @@ -77,7 +76,6 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid) void v9fs_open_fid_add(struct inode *inode, struct p9_fid *fid) { spin_lock(&inode->i_lock); - atomic_set(&fid->count, 1); hlist_add_head(&fid->ilist, (struct hlist_head *)&inode->i_private); spin_unlock(&inode->i_lock); } @@ -110,7 +108,7 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any) hlist_for_each_entry(fid, h, dlist) { if (any || uid_eq(fid->uid, uid)) { ret = fid; - atomic_inc(&ret->count); + refcount_inc(&ret->count); break; } } @@ -201,7 +199,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry, } /* If we are root ourself just return that */ if (dentry->d_sb->s_root == dentry) { - atomic_inc(&fid->count); + refcount_inc(&fid->count); return fid; } /* @@ -250,7 +248,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry, fid = ERR_PTR(-ENOENT); } else { __add_fid(dentry, fid); - atomic_inc(&fid->count); + refcount_inc(&fid->count); spin_unlock(&dentry->d_lock); } } diff --git a/fs/9p/fid.h b/fs/9p/fid.h index 1fed96546728..f7f33509e169 100644 --- a/fs/9p/fid.h +++ b/fs/9p/fid.h @@ -28,7 +28,7 @@ static inline struct p9_fid *v9fs_fid_clone(struct dentry *dentry) if (!fid || IS_ERR(fid)) return fid; - nfid = p9_client_walk(fid, 0, NULL, 1); + nfid = clone_fid(fid); p9_client_clunk(fid); return nfid; } diff --git a/include/net/9p/client.h b/include/net/9p/client.h index 58ed9bd306bd..e1c308d8d288 100644 --- a/include/net/9p/client.h +++ b/include/net/9p/client.h @@ -149,7 +149,7 @@ enum fid_source { struct p9_fid { struct p9_client *clnt; u32 fid; - atomic_t count; + refcount_t count; int mode; struct p9_qid qid; u32 iounit; diff --git a/net/9p/client.c b/net/9p/client.c index a6c8a915e0d8..ba4910138c5b 100644 --- a/net/9p/client.c +++ b/net/9p/client.c @@ -901,7 +901,7 @@ static struct p9_fid *p9_fid_create(struct p9_client *clnt) fid->clnt = clnt; fid->rdir = NULL; fid->fid = 0; - atomic_set(&fid->count, 1); + refcount_set(&fid->count, 1); idr_preload(GFP_KERNEL); spin_lock_irq(&clnt->lock); @@ -1466,7 +1466,7 @@ int p9_client_clunk(struct p9_fid *fid) dump_stack(); return 0; } - if (!atomic_dec_and_test(&fid->count)) + if (!refcount_dec_and_test(&fid->count)) return 0; again: