On Fri, Dec 07, 2018 at 06:51:21AM +0800, Qu Wenruo wrote:
>
>
> On 2018/12/7 3:35 AM, David Sterba wrote:
> > On Mon, Nov 12, 2018 at 10:33:33PM +0100, David Sterba wrote:
> >> On Thu, Nov 08, 2018 at 01:49:12PM +0800, Qu Wenruo wrote:
> >>> This patchset can be fetched from github:
> >>> https://github.com/adam900710/linux/tree/qgroup_delayed_subtree_rebased
> >>>
> >>> Which is based on v4.20-rc1.
> >>
> >> Thanks, I'll add it to for-next soon.
> >
> > The branch was there for some time but not for at least a week (my
> > mistake, I did not notice in time). I've rebased it on top of recent
> > misc-next, but without the delayed refs patchset from Josef.
> >
> > At the moment I'm considering it for merge to 4.21, there's still some
> > time to pull it out in case it shows up to be too problematic. I'm
> > mostly worried about the unknown interactions with the enospc updates or
>
> For that part, I don't think it would cause any obvious problem for the
> enospc updates.
>
> The user-noticeable effect is the delay of reloc tree deletion.
>
> Despite that, it's mostly transparent to extent allocation.
>
> > generally because of lack of qgroup and reloc code reviews.
>
> That's the biggest problem.
>
> However, most of the current qgroup + balance optimization is done inside
> the qgroup code (to skip certain qgroup records), so if we're going to hit
> some problem, this patchset has the highest possibility of hitting it.
>
> Later patches will mostly just keep tweaking qgroup code without affecting
> any other parts.
>
> So I'm fine if you decide to pull it out for now.
I've adapted a stress test that unpacks a large tarball, snapshots every
20 seconds, deletes a random snapshot every 50 seconds and deletes files
from the original subvolume, now enhanced with qgroups just for the new
snapshots inheriting the toplevel subvolume (a rough sketch of the loop
is at the end of this mail). Lockup.

It gets stuck in a snapshot call with the following stacktrace

[<0>] btrfs_tree_read_lock+0xf3/0x150 [btrfs]
[<0>] btrfs_qgroup_trace_subtree+0x280/0x7b0 [btrfs]
[<0>] do_walk_down+0x681/0xb20 [btrfs]
[<0>] walk_down_tree+0xf5/0x1c0 [btrfs]
[<0>] btrfs_drop_snapshot+0x43b/0xb60 [btrfs]
[<0>] btrfs_clean_one_deleted_snapshot+0xc1/0x120 [btrfs]
[<0>] cleaner_kthread+0xf8/0x170 [btrfs]
[<0>] kthread+0x121/0x140
[<0>] ret_from_fork+0x27/0x50

and that's like the 10th snapshot and the ~3rd deletion. This is qgroup show:

qgroupid         rfer         excl parent
--------         ----         ---- ------
0/5         865.27MiB      1.66MiB ---
0/257           0.00B        0.00B ---
0/259           0.00B        0.00B ---
0/260       806.58MiB    637.25MiB ---
0/262           0.00B        0.00B ---
0/263           0.00B        0.00B ---
0/264           0.00B        0.00B ---
0/265           0.00B        0.00B ---
0/266           0.00B        0.00B ---
0/267           0.00B        0.00B ---
0/268           0.00B        0.00B ---
0/269           0.00B        0.00B ---
0/270       989.04MiB      1.22MiB ---
0/271           0.00B        0.00B ---
0/272       922.25MiB    416.00KiB ---
0/273       931.02MiB      1.50MiB ---
0/274       910.94MiB      1.52MiB ---
1/1           1.64GiB      1.64GiB 0/5,0/257,0/259,0/260,0/262,0/263,0/264,0/265,0/266,0/267,0/268,0/269,0/270,0/271,0/272,0/273,0/274

No IO or CPU activity at this point; the stacktrace and the show output
remain the same.

So, considering this, I'm not going to add the patchset to 4.21 but will
keep it in for-next for testing; any fixups or updates will be applied.
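
For reference, the test loop is roughly the following (a minimal sketch only:
the mount point, tarball, the 1/1 qgroup setup and the exact timings are
illustrative placeholders, not the exact script I run):

#!/bin/sh
# Sketch of the stress test: unpack a tarball into a subvolume, snapshot it
# periodically, delete random snapshots and files from the source subvolume.
# Paths, sizes and intervals below are placeholders.
mnt=/mnt/test

btrfs quota enable "$mnt"
btrfs qgroup create 1/1 "$mnt"
# put the toplevel subvolume under the 1/1 qgroup
btrfs qgroup assign 0/5 1/1 "$mnt"

btrfs subvolume create "$mnt/src"
tar xf /tmp/bigtarball.tar -C "$mnt/src" &

i=0
while sleep 20; do
	i=$((i + 1))
	# new snapshots are added to the 1/1 qgroup
	btrfs subvolume snapshot -i 1/1 "$mnt/src" "$mnt/snap$i"

	# roughly every 50-60 seconds: drop a random snapshot and some files
	if [ $((i % 3)) -eq 0 ]; then
		victim=$(ls -d "$mnt"/snap* | shuf -n 1)
		btrfs subvolume delete "$victim"
		find "$mnt/src" -type f | shuf -n 100 | xargs -r rm -f
	fi
done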