On Wed, Dec 05, 2018 at 12:12:21PM -0500, Josef Bacik wrote: > From: Josef Bacik <jba...@fb.com> > > With my delayed refs patches in place we started seeing a large amount > of aborts in __btrfs_free_extent > > BTRFS error (device sdb1): unable to find ref byte nr 91947008 parent 0 root > 35964 owner 1 offset 0 > Call Trace: > ? btrfs_merge_delayed_refs+0xaf/0x340 > __btrfs_run_delayed_refs+0x6ea/0xfc0 > ? btrfs_set_path_blocking+0x31/0x60 > btrfs_run_delayed_refs+0xeb/0x180 > btrfs_commit_transaction+0x179/0x7f0 > ? btrfs_check_space_for_delayed_refs+0x30/0x50 > ? should_end_transaction.isra.19+0xe/0x40 > btrfs_drop_snapshot+0x41c/0x7c0 > btrfs_clean_one_deleted_snapshot+0xb5/0xd0 > cleaner_kthread+0xf6/0x120 > kthread+0xf8/0x130 > ? btree_invalidatepage+0x90/0x90 > ? kthread_bind+0x10/0x10 > ret_from_fork+0x35/0x40 > > This was because btrfs_drop_snapshot depends on the root not being modified > while it's dropping the snapshot. It will unlock the root node (and really > every node) as it walks down the tree, only to re-lock it when it needs to do > something. This is a problem because if we modify the tree we could cow a > block > in our path, which free's our reference to that block. Then once we get back > to > that shared block we'll free our reference to it again, and get ENOENT when > trying to lookup our extent reference to that block in __btrfs_free_extent. > > This is ultimately happening because we have delayed items left to be > processed > for our deleted snapshot _after_ all of the inodes are closed for the > snapshot. > We only run the delayed inode item if we're deleting the inode, and even then > we > do not run the delayed insertions or delayed removals. These can be run at > any > point after our final inode does it's last iput, which is what triggers the > snapshot deletion. We can end up with the snapshot deletion happening and > then > have the delayed items run on that file system, resulting in the above > problem. > > This problem has existed forever, however my patches made it much easier to > hit > as I wake up the cleaner much more often to deal with delayed iputs, which > made > us more likely to start the snapshot dropping work before the transaction > commits, which is when the delayed items would generally be run. Before, > generally speaking, we would run the delayed items, commit the transaction, > and > wakeup the cleaner thread to start deleting snapshots, which means we were > less > likely to hit this problem. You could still hit it if you had multiple > snapshots to be deleted and ended up with lots of delayed items, but it was > definitely harder. > > Fix for now by simply running all the delayed items before starting to drop > the > snapshot. We could make this smarter in the future by making the delayed > items > per-root, and then simply drop any delayed items for roots that we are going > to > delete. But for now just a quick and easy solution is the safest. > > Cc: sta...@vger.kernel.org > Signed-off-by: Josef Bacik <jo...@toxicpanda.com> > --- > v1->v2: > - check for errors from btrfs_run_delayed_items. > - Dave I can reroll the series, but the second version of patch 1 is the same, > let me know what you want.
As this is a small update it's fine to send just that patch. You may also use --in-reply-to so it threads to the original series. Resending series makes most sense (to me) when there's a discussion and many changes, so a fresh series makes it clear what's the current status. Patch replaced in for-next topic branch, thanks.