Re: [PATCH 4.4 017/268] do d_instantiate/unlock_new_inode combinations safely
On Mon, 2018-05-28 at 11:59 +0200, Greg Kroah-Hartman wrote: > 4.4-stable review patch. If anyone has any objections, please let me know. > > -- > > From: Al Viro > > commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream. > > For anything NFS-exported we do _not_ want to unlock new inode > before it has grown an alias; original set of fixes got the > ordering right, but missed the nasty complication in case of > lockdep being enabled - unlock_new_inode() does > lockdep_annotate_inode_mutex_key(inode) > which can only be done before anyone gets a chance to touch > ->i_mutex. Unfortunately, flipping the order and doing > unlock_new_inode() before d_instantiate() opens a window when > mkdir can race with open-by-fhandle on a guessed fhandle, leading > to multiple aliases for a directory inode and all the breakage > that follows from that. > > Correct solution: a new primitive (d_instantiate_new()) > combining these two in the right order - lockdep annotate, then > d_instantiate(), then the rest of unlock_new_inode(). All > combinations of d_instantiate() with unlock_new_inode() should > be converted to that. [...] I think you missed xfs, which has a wrapper around unlock_new_inode() called xfs_finish_inode_setup(). It looks like xfs_generic_create() and xfs_vn_symlink() still need this conversion. Ben. -- Ben Hutchings, Software Developer Codethink Ltd https://www.codethink.co.uk/ Dale House, 35 Dale Street Manchester, M1 2HF, United Kingdom
Re: [PATCH 4.4 017/268] do d_instantiate/unlock_new_inode combinations safely
On Mon, 2018-05-28 at 11:59 +0200, Greg Kroah-Hartman wrote: > 4.4-stable review patch. If anyone has any objections, please let me know. > > -- > > From: Al Viro > > commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream. > > For anything NFS-exported we do _not_ want to unlock new inode > before it has grown an alias; original set of fixes got the > ordering right, but missed the nasty complication in case of > lockdep being enabled - unlock_new_inode() does > lockdep_annotate_inode_mutex_key(inode) > which can only be done before anyone gets a chance to touch > ->i_mutex. Unfortunately, flipping the order and doing > unlock_new_inode() before d_instantiate() opens a window when > mkdir can race with open-by-fhandle on a guessed fhandle, leading > to multiple aliases for a directory inode and all the breakage > that follows from that. > > Correct solution: a new primitive (d_instantiate_new()) > combining these two in the right order - lockdep annotate, then > d_instantiate(), then the rest of unlock_new_inode(). All > combinations of d_instantiate() with unlock_new_inode() should > be converted to that. [...] I think you missed xfs, which has a wrapper around unlock_new_inode() called xfs_finish_inode_setup(). It looks like xfs_generic_create() and xfs_vn_symlink() still need this conversion. Ben. -- Ben Hutchings, Software Developer Codethink Ltd https://www.codethink.co.uk/ Dale House, 35 Dale Street Manchester, M1 2HF, United Kingdom
[PATCH 4.4 017/268] do d_instantiate/unlock_new_inode combinations safely
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Al Virocommit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream. For anything NFS-exported we do _not_ want to unlock new inode before it has grown an alias; original set of fixes got the ordering right, but missed the nasty complication in case of lockdep being enabled - unlock_new_inode() does lockdep_annotate_inode_mutex_key(inode) which can only be done before anyone gets a chance to touch ->i_mutex. Unfortunately, flipping the order and doing unlock_new_inode() before d_instantiate() opens a window when mkdir can race with open-by-fhandle on a guessed fhandle, leading to multiple aliases for a directory inode and all the breakage that follows from that. Correct solution: a new primitive (d_instantiate_new()) combining these two in the right order - lockdep annotate, then d_instantiate(), then the rest of unlock_new_inode(). All combinations of d_instantiate() with unlock_new_inode() should be converted to that. Cc: sta...@kernel.org # 2.6.29 and later Tested-by: Mike Marshall Reviewed-by: Andreas Dilger Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/inode.c | 16 fs/dcache.c| 22 ++ fs/ecryptfs/inode.c|3 +-- fs/ext2/namei.c|6 ++ fs/ext4/namei.c|6 ++ fs/f2fs/namei.c| 12 fs/jffs2/dir.c | 12 fs/jfs/namei.c | 12 fs/nilfs2/namei.c |6 ++ fs/reiserfs/namei.c| 12 fs/udf/namei.c |6 ++ fs/ufs/namei.c |6 ++ include/linux/dcache.h |1 + 13 files changed, 54 insertions(+), 66 deletions(-) --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6413,8 +6413,7 @@ static int btrfs_mknod(struct inode *dir goto out_unlock_inode; } else { btrfs_update_inode(trans, root, inode); - unlock_new_inode(inode); - d_instantiate(dentry, inode); + d_instantiate_new(dentry, inode); } out_unlock: @@ -6489,8 +6488,7 @@ static int btrfs_create(struct inode *di goto out_unlock_inode; BTRFS_I(inode)->io_tree.ops = _extent_io_ops; - unlock_new_inode(inode); - d_instantiate(dentry, inode); + d_instantiate_new(dentry, inode); out_unlock: btrfs_end_transaction(trans, root); @@ -6633,12 +6631,7 @@ static int btrfs_mkdir(struct inode *dir if (err) goto out_fail_inode; - d_instantiate(dentry, inode); - /* -* mkdir is special. We're unlocking after we call d_instantiate -* to avoid a race with nfsd calling d_instantiate. -*/ - unlock_new_inode(inode); + d_instantiate_new(dentry, inode); drop_on_err = 0; out_fail: @@ -9789,8 +9782,7 @@ static int btrfs_symlink(struct inode *d goto out_unlock_inode; } - unlock_new_inode(inode); - d_instantiate(dentry, inode); + d_instantiate_new(dentry, inode); out_unlock: btrfs_end_transaction(trans, root); --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1897,6 +1897,28 @@ struct dentry *d_instantiate_unique(stru EXPORT_SYMBOL(d_instantiate_unique); +/* + * This should be equivalent to d_instantiate() + unlock_new_inode(), + * with lockdep-related part of unlock_new_inode() done before + * anything else. Use that instead of open-coding d_instantiate()/ + * unlock_new_inode() combinations. + */ +void d_instantiate_new(struct dentry *entry, struct inode *inode) +{ + BUG_ON(!hlist_unhashed(>d_u.d_alias)); + BUG_ON(!inode); + lockdep_annotate_inode_mutex_key(inode); + security_d_instantiate(entry, inode); + spin_lock(>i_lock); + __d_instantiate(entry, inode); + WARN_ON(!(inode->i_state & I_NEW)); + inode->i_state &= ~I_NEW; + smp_mb(); + wake_up_bit(>i_state, __I_NEW); + spin_unlock(>i_lock); +} +EXPORT_SYMBOL(d_instantiate_new); + /** * d_instantiate_no_diralias - instantiate a non-aliased dentry * @entry: dentry to complete --- a/fs/ecryptfs/inode.c +++ b/fs/ecryptfs/inode.c @@ -287,8 +287,7 @@ ecryptfs_create(struct inode *directory_ iput(ecryptfs_inode); goto out; } - unlock_new_inode(ecryptfs_inode); - d_instantiate(ecryptfs_dentry, ecryptfs_inode); + d_instantiate_new(ecryptfs_dentry, ecryptfs_inode); out: return rc; } --- a/fs/ext2/namei.c +++ b/fs/ext2/namei.c @@ -40,8 +40,7 @@ static inline int ext2_add_nondir(struct { int err = ext2_add_link(dentry, inode); if (!err) { - unlock_new_inode(inode); - d_instantiate(dentry, inode); +
[PATCH 4.4 017/268] do d_instantiate/unlock_new_inode combinations safely
4.4-stable review patch. If anyone has any objections, please let me know. -- From: Al Viro commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream. For anything NFS-exported we do _not_ want to unlock new inode before it has grown an alias; original set of fixes got the ordering right, but missed the nasty complication in case of lockdep being enabled - unlock_new_inode() does lockdep_annotate_inode_mutex_key(inode) which can only be done before anyone gets a chance to touch ->i_mutex. Unfortunately, flipping the order and doing unlock_new_inode() before d_instantiate() opens a window when mkdir can race with open-by-fhandle on a guessed fhandle, leading to multiple aliases for a directory inode and all the breakage that follows from that. Correct solution: a new primitive (d_instantiate_new()) combining these two in the right order - lockdep annotate, then d_instantiate(), then the rest of unlock_new_inode(). All combinations of d_instantiate() with unlock_new_inode() should be converted to that. Cc: sta...@kernel.org # 2.6.29 and later Tested-by: Mike Marshall Reviewed-by: Andreas Dilger Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/inode.c | 16 fs/dcache.c| 22 ++ fs/ecryptfs/inode.c|3 +-- fs/ext2/namei.c|6 ++ fs/ext4/namei.c|6 ++ fs/f2fs/namei.c| 12 fs/jffs2/dir.c | 12 fs/jfs/namei.c | 12 fs/nilfs2/namei.c |6 ++ fs/reiserfs/namei.c| 12 fs/udf/namei.c |6 ++ fs/ufs/namei.c |6 ++ include/linux/dcache.h |1 + 13 files changed, 54 insertions(+), 66 deletions(-) --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -6413,8 +6413,7 @@ static int btrfs_mknod(struct inode *dir goto out_unlock_inode; } else { btrfs_update_inode(trans, root, inode); - unlock_new_inode(inode); - d_instantiate(dentry, inode); + d_instantiate_new(dentry, inode); } out_unlock: @@ -6489,8 +6488,7 @@ static int btrfs_create(struct inode *di goto out_unlock_inode; BTRFS_I(inode)->io_tree.ops = _extent_io_ops; - unlock_new_inode(inode); - d_instantiate(dentry, inode); + d_instantiate_new(dentry, inode); out_unlock: btrfs_end_transaction(trans, root); @@ -6633,12 +6631,7 @@ static int btrfs_mkdir(struct inode *dir if (err) goto out_fail_inode; - d_instantiate(dentry, inode); - /* -* mkdir is special. We're unlocking after we call d_instantiate -* to avoid a race with nfsd calling d_instantiate. -*/ - unlock_new_inode(inode); + d_instantiate_new(dentry, inode); drop_on_err = 0; out_fail: @@ -9789,8 +9782,7 @@ static int btrfs_symlink(struct inode *d goto out_unlock_inode; } - unlock_new_inode(inode); - d_instantiate(dentry, inode); + d_instantiate_new(dentry, inode); out_unlock: btrfs_end_transaction(trans, root); --- a/fs/dcache.c +++ b/fs/dcache.c @@ -1897,6 +1897,28 @@ struct dentry *d_instantiate_unique(stru EXPORT_SYMBOL(d_instantiate_unique); +/* + * This should be equivalent to d_instantiate() + unlock_new_inode(), + * with lockdep-related part of unlock_new_inode() done before + * anything else. Use that instead of open-coding d_instantiate()/ + * unlock_new_inode() combinations. + */ +void d_instantiate_new(struct dentry *entry, struct inode *inode) +{ + BUG_ON(!hlist_unhashed(>d_u.d_alias)); + BUG_ON(!inode); + lockdep_annotate_inode_mutex_key(inode); + security_d_instantiate(entry, inode); + spin_lock(>i_lock); + __d_instantiate(entry, inode); + WARN_ON(!(inode->i_state & I_NEW)); + inode->i_state &= ~I_NEW; + smp_mb(); + wake_up_bit(>i_state, __I_NEW); + spin_unlock(>i_lock); +} +EXPORT_SYMBOL(d_instantiate_new); + /** * d_instantiate_no_diralias - instantiate a non-aliased dentry * @entry: dentry to complete --- a/fs/ecryptfs/inode.c +++ b/fs/ecryptfs/inode.c @@ -287,8 +287,7 @@ ecryptfs_create(struct inode *directory_ iput(ecryptfs_inode); goto out; } - unlock_new_inode(ecryptfs_inode); - d_instantiate(ecryptfs_dentry, ecryptfs_inode); + d_instantiate_new(ecryptfs_dentry, ecryptfs_inode); out: return rc; } --- a/fs/ext2/namei.c +++ b/fs/ext2/namei.c @@ -40,8 +40,7 @@ static inline int ext2_add_nondir(struct { int err = ext2_add_link(dentry, inode); if (!err) { - unlock_new_inode(inode); - d_instantiate(dentry, inode); + d_instantiate_new(dentry, inode); return 0; } inode_dec_link_count(inode); @@