Hello Eddie,

Eddie Horng:
> I got hung to access to an aufs mount dir, dmesg shows mutex_lock call
> stack in aufs.
        :::
> kernel version: 4.10.0-34-generic
> aufs version: 4.x-rcN-20170206

Hmm, your aufs version is rather old, released almost a year ago. But I
don't think that matters here.

> dmesg log:
        :::
> [Tue Jan 16 16:02:22 2018] INFO: task ninja:18218 blocked for more than 120
> seconds.
        :::
> [Tue Jan 16 16:02:22 2018]  mutex_lock+0x2f/0x40
> [Tue Jan 16 16:02:22 2018]  au_xino_delete_inode+0x18c/0x1e0 [aufs]
> [Tue Jan 16 16:02:22 2018]  au_iinfo_fin+0x163/0x1d0 [aufs]
> [Tue Jan 16 16:02:22 2018]  aufs_destroy_inode+0x47/0x50 [aufs]
> [Tue Jan 16 16:02:22 2018]  destroy_inode+0x3b/0x60
> [Tue Jan 16 16:02:22 2018]  evict+0x136/0x1a0
> [Tue Jan 16 16:02:22 2018]  iput+0x1b2/0x230
> [Tue Jan 16 16:02:22 2018]  do_unlinkat+0x12c/0x320
> [Tue Jan 16 16:02:22 2018]  SyS_unlink+0x16/0x20

So you have several processes which are all blocked in
+ au_iinfo_fin
  + au_xino_delete_inode
    + mutex_lock
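
To confirm that from a saved log, a small shell sketch like the one below
may help. The function names blocked_tasks and count_frame are made up for
this mail, and the patterns assume the dmesg format quoted above.

```shell
# List the tasks reported by the hung-task detector in a dmesg log.
# Usage: blocked_tasks dmesg.txt   (or: dmesg | blocked_tasks)
blocked_tasks() {
    # Matches lines such as "INFO: task ninja:18218 blocked for more than 120"
    sed -n 's/.*INFO: task \(.*\) blocked for more than.*/\1/p' "$@" | sort -u
}

# Count the stack-trace lines which contain a given function.
# $1: function name, e.g. au_xino_delete_inode
# $2: log file
count_frame() {
    grep -c "[[:space:]]$1+0x" "$2"
}
```

For example, "dmesg > dmesg.txt; blocked_tasks dmesg.txt" prints one
name:pid per blocked task, and "count_frame au_xino_delete_inode dmesg.txt"
shows how many of the traces go through that function.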

Hmm, currently I have no idea what is wrong.

au_xino_delete_inode() explicitly acquires an aufs internal mutex to
maintain the internal bitmap.
The function also implicitly acquires a mutex for the XINO file, which
should be located on your first writable branch, /vol/upper.
I guess these two mutexes are the main candidates.

I am not sure which mutex causes the problem, so I'd suggest you try the
following.
- first, just for confirmation, run "cat /sys/fs/aufs/si_*/xi_path". It
  should print /vol/upper/.aufs.xino.
- apply lockdep-debug.patch and enable CONFIG_LOCKDEP.
- rebuild your kernel and aufs module.
- reboot and test.
- when the problem happens, the acquired locks should appear in dmesg, so
  we can identify which mutex is problematic.
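
The first check and the CONFIG_LOCKDEP check can be scripted roughly as
below. This is only a sketch: the function names show_xi_path and
lockdep_enabled are made up for this mail, and /boot/config-$(uname -r)
is a distribution convention which I am assuming, your kernel config may
be somewhere else.

```shell
# Print the XINO path of every mounted aufs.
# $1: sysfs directory, defaults to /sys/fs/aufs
show_xi_path() {
    dir=${1:-/sys/fs/aufs}
    for f in "$dir"/si_*/xi_path; do
        # Skip the unexpanded pattern when no aufs is mounted.
        [ -r "$f" ] && printf '%s: %s\n' "$f" "$(cat "$f")"
    done
    return 0
}

# Succeed when lockdep is enabled in a kernel config file.
# $1: config file, defaults to /boot/config-$(uname -r) (assumed location)
lockdep_enabled() {
    cfg=${1:-/boot/config-$(uname -r)}
    grep -q '^CONFIG_LOCKDEP=y' "$cfg"
}
```

After the reboot, "show_xi_path; lockdep_enabled && echo lockdep is on"
should be enough to verify the setup. Note that in the mainline Kconfig,
CONFIG_LOCKDEP is usually turned on indirectly by selecting an option such
as CONFIG_PROVE_LOCKING, so check the resulting .config instead of looking
for it in the menu.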


J. R. Okajima


