Re: [PATCH 4.4 017/268] do d_instantiate/unlock_new_inode combinations safely

2018-06-08 Thread Ben Hutchings
On Mon, 2018-05-28 at 11:59 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Al Viro 
> 
> commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream.
> 
> For anything NFS-exported we do _not_ want to unlock new inode
> before it has grown an alias; original set of fixes got the
> ordering right, but missed the nasty complication in case of
> lockdep being enabled - unlock_new_inode() does
>   lockdep_annotate_inode_mutex_key(inode)
> which can only be done before anyone gets a chance to touch
> ->i_mutex.  Unfortunately, flipping the order and doing
> unlock_new_inode() before d_instantiate() opens a window when
> mkdir can race with open-by-fhandle on a guessed fhandle, leading
> to multiple aliases for a directory inode and all the breakage
> that follows from that.
> 
>   Correct solution: a new primitive (d_instantiate_new())
> combining these two in the right order - lockdep annotate, then
> d_instantiate(), then the rest of unlock_new_inode().  All
> combinations of d_instantiate() with unlock_new_inode() should
> be converted to that.
[...]

I think you missed xfs, which has a wrapper around unlock_new_inode()
called xfs_finish_inode_setup().  It looks like xfs_generic_create()
and xfs_vn_symlink() still need this conversion.

Ben.

-- 
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
 Manchester, M1 2HF, United Kingdom


Re: [PATCH 4.4 017/268] do d_instantiate/unlock_new_inode combinations safely

2018-06-08 Thread Ben Hutchings
On Mon, 2018-05-28 at 11:59 +0200, Greg Kroah-Hartman wrote:
> 4.4-stable review patch.  If anyone has any objections, please let me know.
> 
> --
> 
> From: Al Viro 
> 
> commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream.
> 
> For anything NFS-exported we do _not_ want to unlock new inode
> before it has grown an alias; original set of fixes got the
> ordering right, but missed the nasty complication in case of
> lockdep being enabled - unlock_new_inode() does
>   lockdep_annotate_inode_mutex_key(inode)
> which can only be done before anyone gets a chance to touch
> ->i_mutex.  Unfortunately, flipping the order and doing
> unlock_new_inode() before d_instantiate() opens a window when
> mkdir can race with open-by-fhandle on a guessed fhandle, leading
> to multiple aliases for a directory inode and all the breakage
> that follows from that.
> 
>   Correct solution: a new primitive (d_instantiate_new())
> combining these two in the right order - lockdep annotate, then
> d_instantiate(), then the rest of unlock_new_inode().  All
> combinations of d_instantiate() with unlock_new_inode() should
> be converted to that.
[...]

I think you missed xfs, which has a wrapper around unlock_new_inode()
called xfs_finish_inode_setup().  It looks like xfs_generic_create()
and xfs_vn_symlink() still need this conversion.

Ben.

-- 
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
 Manchester, M1 2HF, United Kingdom


[PATCH 4.4 017/268] do d_instantiate/unlock_new_inode combinations safely

2018-05-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Al Viro 

commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream.

For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex.  Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.

Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode().  All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.

Cc: sta...@kernel.org   # 2.6.29 and later
Tested-by: Mike Marshall 
Reviewed-by: Andreas Dilger 
Signed-off-by: Al Viro 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/btrfs/inode.c   |   16 
 fs/dcache.c|   22 ++
 fs/ecryptfs/inode.c|3 +--
 fs/ext2/namei.c|6 ++
 fs/ext4/namei.c|6 ++
 fs/f2fs/namei.c|   12 
 fs/jffs2/dir.c |   12 
 fs/jfs/namei.c |   12 
 fs/nilfs2/namei.c  |6 ++
 fs/reiserfs/namei.c|   12 
 fs/udf/namei.c |6 ++
 fs/ufs/namei.c |6 ++
 include/linux/dcache.h |1 +
 13 files changed, 54 insertions(+), 66 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6413,8 +6413,7 @@ static int btrfs_mknod(struct inode *dir
goto out_unlock_inode;
} else {
btrfs_update_inode(trans, root, inode);
-   unlock_new_inode(inode);
-   d_instantiate(dentry, inode);
+   d_instantiate_new(dentry, inode);
}
 
 out_unlock:
@@ -6489,8 +6488,7 @@ static int btrfs_create(struct inode *di
goto out_unlock_inode;
 
BTRFS_I(inode)->io_tree.ops = _extent_io_ops;
-   unlock_new_inode(inode);
-   d_instantiate(dentry, inode);
+   d_instantiate_new(dentry, inode);
 
 out_unlock:
btrfs_end_transaction(trans, root);
@@ -6633,12 +6631,7 @@ static int btrfs_mkdir(struct inode *dir
if (err)
goto out_fail_inode;
 
-   d_instantiate(dentry, inode);
-   /*
-* mkdir is special.  We're unlocking after we call d_instantiate
-* to avoid a race with nfsd calling d_instantiate.
-*/
-   unlock_new_inode(inode);
+   d_instantiate_new(dentry, inode);
drop_on_err = 0;
 
 out_fail:
@@ -9789,8 +9782,7 @@ static int btrfs_symlink(struct inode *d
goto out_unlock_inode;
}
 
-   unlock_new_inode(inode);
-   d_instantiate(dentry, inode);
+   d_instantiate_new(dentry, inode);
 
 out_unlock:
btrfs_end_transaction(trans, root);
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1897,6 +1897,28 @@ struct dentry *d_instantiate_unique(stru
 
 EXPORT_SYMBOL(d_instantiate_unique);
 
+/*
+ * This should be equivalent to d_instantiate() + unlock_new_inode(),
+ * with lockdep-related part of unlock_new_inode() done before
+ * anything else.  Use that instead of open-coding d_instantiate()/
+ * unlock_new_inode() combinations.
+ */
+void d_instantiate_new(struct dentry *entry, struct inode *inode)
+{
+   BUG_ON(!hlist_unhashed(>d_u.d_alias));
+   BUG_ON(!inode);
+   lockdep_annotate_inode_mutex_key(inode);
+   security_d_instantiate(entry, inode);
+   spin_lock(>i_lock);
+   __d_instantiate(entry, inode);
+   WARN_ON(!(inode->i_state & I_NEW));
+   inode->i_state &= ~I_NEW;
+   smp_mb();
+   wake_up_bit(>i_state, __I_NEW);
+   spin_unlock(>i_lock);
+}
+EXPORT_SYMBOL(d_instantiate_new);
+
 /**
  * d_instantiate_no_diralias - instantiate a non-aliased dentry
  * @entry: dentry to complete
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -287,8 +287,7 @@ ecryptfs_create(struct inode *directory_
iput(ecryptfs_inode);
goto out;
}
-   unlock_new_inode(ecryptfs_inode);
-   d_instantiate(ecryptfs_dentry, ecryptfs_inode);
+   d_instantiate_new(ecryptfs_dentry, ecryptfs_inode);
 out:
return rc;
 }
--- a/fs/ext2/namei.c
+++ b/fs/ext2/namei.c
@@ -40,8 +40,7 @@ static inline int ext2_add_nondir(struct
 {
int err = ext2_add_link(dentry, inode);
if (!err) {
-   unlock_new_inode(inode);
-   d_instantiate(dentry, inode);
+

[PATCH 4.4 017/268] do d_instantiate/unlock_new_inode combinations safely

2018-05-28 Thread Greg Kroah-Hartman
4.4-stable review patch.  If anyone has any objections, please let me know.

--

From: Al Viro 

commit 1e2e547a93a00ebc21582c06ca3c6cfea2a309ee upstream.

For anything NFS-exported we do _not_ want to unlock new inode
before it has grown an alias; original set of fixes got the
ordering right, but missed the nasty complication in case of
lockdep being enabled - unlock_new_inode() does
lockdep_annotate_inode_mutex_key(inode)
which can only be done before anyone gets a chance to touch
->i_mutex.  Unfortunately, flipping the order and doing
unlock_new_inode() before d_instantiate() opens a window when
mkdir can race with open-by-fhandle on a guessed fhandle, leading
to multiple aliases for a directory inode and all the breakage
that follows from that.

Correct solution: a new primitive (d_instantiate_new())
combining these two in the right order - lockdep annotate, then
d_instantiate(), then the rest of unlock_new_inode().  All
combinations of d_instantiate() with unlock_new_inode() should
be converted to that.

Cc: sta...@kernel.org   # 2.6.29 and later
Tested-by: Mike Marshall 
Reviewed-by: Andreas Dilger 
Signed-off-by: Al Viro 
Signed-off-by: Greg Kroah-Hartman 

---
 fs/btrfs/inode.c   |   16 
 fs/dcache.c|   22 ++
 fs/ecryptfs/inode.c|3 +--
 fs/ext2/namei.c|6 ++
 fs/ext4/namei.c|6 ++
 fs/f2fs/namei.c|   12 
 fs/jffs2/dir.c |   12 
 fs/jfs/namei.c |   12 
 fs/nilfs2/namei.c  |6 ++
 fs/reiserfs/namei.c|   12 
 fs/udf/namei.c |6 ++
 fs/ufs/namei.c |6 ++
 include/linux/dcache.h |1 +
 13 files changed, 54 insertions(+), 66 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -6413,8 +6413,7 @@ static int btrfs_mknod(struct inode *dir
goto out_unlock_inode;
} else {
btrfs_update_inode(trans, root, inode);
-   unlock_new_inode(inode);
-   d_instantiate(dentry, inode);
+   d_instantiate_new(dentry, inode);
}
 
 out_unlock:
@@ -6489,8 +6488,7 @@ static int btrfs_create(struct inode *di
goto out_unlock_inode;
 
BTRFS_I(inode)->io_tree.ops = _extent_io_ops;
-   unlock_new_inode(inode);
-   d_instantiate(dentry, inode);
+   d_instantiate_new(dentry, inode);
 
 out_unlock:
btrfs_end_transaction(trans, root);
@@ -6633,12 +6631,7 @@ static int btrfs_mkdir(struct inode *dir
if (err)
goto out_fail_inode;
 
-   d_instantiate(dentry, inode);
-   /*
-* mkdir is special.  We're unlocking after we call d_instantiate
-* to avoid a race with nfsd calling d_instantiate.
-*/
-   unlock_new_inode(inode);
+   d_instantiate_new(dentry, inode);
drop_on_err = 0;
 
 out_fail:
@@ -9789,8 +9782,7 @@ static int btrfs_symlink(struct inode *d
goto out_unlock_inode;
}
 
-   unlock_new_inode(inode);
-   d_instantiate(dentry, inode);
+   d_instantiate_new(dentry, inode);
 
 out_unlock:
btrfs_end_transaction(trans, root);
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1897,6 +1897,28 @@ struct dentry *d_instantiate_unique(stru
 
 EXPORT_SYMBOL(d_instantiate_unique);
 
+/*
+ * This should be equivalent to d_instantiate() + unlock_new_inode(),
+ * with lockdep-related part of unlock_new_inode() done before
+ * anything else.  Use that instead of open-coding d_instantiate()/
+ * unlock_new_inode() combinations.
+ */
+void d_instantiate_new(struct dentry *entry, struct inode *inode)
+{
+   BUG_ON(!hlist_unhashed(>d_u.d_alias));
+   BUG_ON(!inode);
+   lockdep_annotate_inode_mutex_key(inode);
+   security_d_instantiate(entry, inode);
+   spin_lock(>i_lock);
+   __d_instantiate(entry, inode);
+   WARN_ON(!(inode->i_state & I_NEW));
+   inode->i_state &= ~I_NEW;
+   smp_mb();
+   wake_up_bit(>i_state, __I_NEW);
+   spin_unlock(>i_lock);
+}
+EXPORT_SYMBOL(d_instantiate_new);
+
 /**
  * d_instantiate_no_diralias - instantiate a non-aliased dentry
  * @entry: dentry to complete
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -287,8 +287,7 @@ ecryptfs_create(struct inode *directory_
iput(ecryptfs_inode);
goto out;
}
-   unlock_new_inode(ecryptfs_inode);
-   d_instantiate(ecryptfs_dentry, ecryptfs_inode);
+   d_instantiate_new(ecryptfs_dentry, ecryptfs_inode);
 out:
return rc;
 }
--- a/fs/ext2/namei.c
+++ b/fs/ext2/namei.c
@@ -40,8 +40,7 @@ static inline int ext2_add_nondir(struct
 {
int err = ext2_add_link(dentry, inode);
if (!err) {
-   unlock_new_inode(inode);
-   d_instantiate(dentry, inode);
+   d_instantiate_new(dentry, inode);
return 0;
}
inode_dec_link_count(inode);
@@