On Thu, Sep 26, 2019 at 02:13:44PM +0300, Nikolay Borisov wrote: > > > On 26.09.19 г. 13:54 ч., Josef Bacik wrote: > > We hit the following warning while running down a different problem > > > > [ 6197.175850] ------------[ cut here ]------------ > > [ 6197.185082] refcount_t: underflow; use-after-free. > > [ 6197.194704] WARNING: CPU: 47 PID: 966 at lib/refcount.c:190 > > refcount_sub_and_test_checked+0x53/0x60 > > [ 6197.521792] Call Trace: > > [ 6197.526687] __btrfs_release_delayed_node+0x76/0x1c0 > > [ 6197.536615] btrfs_kill_all_delayed_nodes+0xec/0x130 > > [ 6197.546532] ? __btrfs_btree_balance_dirty+0x60/0x60 > > [ 6197.556482] btrfs_clean_one_deleted_snapshot+0x71/0xd0 > > [ 6197.566910] cleaner_kthread+0xfa/0x120 > > [ 6197.574573] kthread+0x111/0x130 > > [ 6197.581022] ? kthread_create_on_node+0x60/0x60 > > [ 6197.590086] ret_from_fork+0x1f/0x30 > > [ 6197.597228] ---[ end trace 424bb7ae00509f56 ]--- > > > > This is because we're unconditionally grabbing a ref to every node, but > > there could be nodes with a 0 refcount. Fix this to instead use > > refcount_inc_not_zero() and only process the list for the nodes we get a > > refcount on. > > > I'd also add the detail that __btrfs_release_delayed_node actually does > the refcount_dec_and_test outside of &root->inode_lock which allows this > scenario. > > And this bug seems rather old, ever since : > > 16cdcec736cd ("btrfs: implement delayed inode items operation") > > > > > > Signed-off-by: Josef Bacik <jo...@toxicpanda.com> > > --- > > fs/btrfs/delayed-inode.c | 15 +++++++++------ > > 1 file changed, 9 insertions(+), 6 deletions(-) > > > > diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c > > index 1f7f39b10bd0..320503750896 100644 > > --- a/fs/btrfs/delayed-inode.c > > +++ b/fs/btrfs/delayed-inode.c > > @@ -1936,7 +1936,7 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root > > *root) > > { > > u64 inode_id = 0; > > struct btrfs_delayed_node *delayed_nodes[8]; > > - int i, n; > > + int i, n, count; > > > > while (1) { > > spin_lock(&root->inode_lock); > > @@ -1948,13 +1948,16 @@ void btrfs_kill_all_delayed_nodes(struct btrfs_root > > *root) > > break; > > } > > > > - inode_id = delayed_nodes[n - 1]->inode_id + 1; > > - > > - for (i = 0; i < n; i++) > > - refcount_inc(&delayed_nodes[i]->refs); > > + count = 0; > > + for (i = 0; i < n; i++) { > > + if (!refcount_inc_not_zero(&delayed_nodes[i]->refs)) > > + break; > > + count++; > > This is buggy, if the very first inode in the gang causes the break > statement then the code does delayed_nodes[0 - 1]->inode_id. E.g. the > increment should be before the refcount_inc_not_zero. >
Sigh, no patches from me before breakfast. I'll fix this up after I eat. Thanks, Josef