On 31.10.18 at 9:14, Lu Fengqi wrote:
> On Tue, Oct 30, 2018 at 05:14:42PM -0700, Omar Sandoval wrote:
>> From: Omar Sandoval <osan...@fb.com>
>>
>> There's a race between close_ctree() and cleaner_kthread().
>> close_ctree() sets btrfs_fs_closing(), and the cleaner stops when it
>> sees it set, but this is racy; the cleaner might have already checked
>> the bit and could be cleaning stuff. In particular, if it deletes unused
>> block groups, it will create delayed iputs for the free space cache
>> inodes. As of "btrfs: don't run delayed_iputs in commit", we're no
>> longer running delayed iputs after a commit. Therefore, if the cleaner
>> creates more delayed iputs after delayed iputs are run in
>> btrfs_commit_super(), we will leak inodes on unmount and get a busy
>> inode crash from the VFS.
>>
>> Fix it by parking the cleaner before we actually close anything. Then,
>> any remaining delayed iputs will always be handled in
>> btrfs_commit_super(). This also ensures that the commit in close_ctree()
>> is really the last commit, so we can get rid of the commit in
>> cleaner_kthread().
>>
>> Fixes: 30928e9baac2 ("btrfs: don't run delayed_iputs in commit")
>> Signed-off-by: Omar Sandoval <osan...@fb.com>
>> ---
>> We found this with a stress test that our containers team runs. I'm
>> wondering if this same race could have caused any other issues other
>> than this new iput thing, but I couldn't identify any.
> 
> I noticed an inode leak issue in generic/475, but whether dropping commit
> 30928e9baac2 ("btrfs: don't run delayed_iputs in commit") or applying
> this patch, the issue still exists.
> 
> I have attached the dmesg.

Are you able to trigger this reliably, i.e. 100% of the time, or just,
say, 80-90% of the time? If it's sporadic (but frequent) then it is
likely an issue with error handling. Also, looking at the log I see:

[  367.977998] BTRFS info (device dm-3): at unmount delalloc count 8192

This means we are leaking 8k of delalloc space which should have been
cleaned up. The warning is triggered because an inode is still in the
inode rb tree. I suggest that you patch btrfs_free_fs_root to print the
inos if root->inode_tree is not empty. That way you can see whether the
inos correspond to data files or to free space inodes. But given that
btrfs also complains about leftover delalloc bytes at unmount, I'd be
willing to say it's data inodes.
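
A rough userspace sketch of the check suggested above, in case it helps.
This is not the kernel patch itself: struct model_root, struct model_inode
and model_report_leftover_inodes() are hypothetical stand-ins for
illustration only, and a plain binary tree replaces the rb tree. In the
kernel the walk would go over root->inode_tree with rb_first()/rb_next()
and print each inode number from there.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for an inode still linked into a root's tree. */
struct model_inode {
	uint64_t ino;
	struct model_inode *left;
	struct model_inode *right;
};

/* Hypothetical stand-in for a root; tree == NULL means the tree is empty. */
struct model_root {
	uint64_t objectid;
	struct model_inode *tree;
};

/* In-order walk: print every leftover ino and return how many were seen. */
static size_t model_walk(const struct model_root *root,
			 const struct model_inode *node)
{
	size_t count;

	if (!node)
		return 0;

	count = model_walk(root, node->left);
	fprintf(stderr, "root %llu still has ino %llu\n",
		(unsigned long long)root->objectid,
		(unsigned long long)node->ino);
	count++;
	count += model_walk(root, node->right);
	return count;
}

/*
 * Report leftover inodes for one root. Prints nothing and returns 0 when
 * the tree is empty, which is the expected state when the root is freed.
 */
static size_t model_report_leftover_inodes(const struct model_root *root)
{
	return model_walk(root, root->tree);
}
```

A non-zero return means the root still had inodes attached when it was
about to be freed; the printed inos are what you would then compare
against the files touched by the test.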

<snip>
