Re: aufs_do_xino_fwrite() EINTR loop (WAS: aufs_destroy_inode() deadlock?)
OK, and thank you a lot again for working on this!

2015-12-26 12:26 GMT+09:00 <[1]sf...@users.sourceforge.net>:

    Akihiro Suda:
    > However, the bug is a regression caused by commit 296291cd ("mm: make
    > sendfile(2) killable") to the upstream of the Linux kernel.
    > [2]https://github.com/torvalds/linux/commit/296291cd
    > This produces an infinite -EINTR loop in mm/filemap.c:generic_perform_write().
    > Perhaps it can also affect filesystems other than AUFS.

    It is doubtful. The only case where something bad can happen is process
    accounting. As you may know, process accounting writes some info to a
    file when a process exits. And if the filesystem where the accounting
    file lives uses generic_file_write_iter() as aufs does, then the info
    may not be written. But the process won't hang at all. I am afraid this
    issue is aufs specific.

    > So I think it's possible to keep the current AUFS, and put a patch to
    > the Linux kernel.

    I believe aufs should follow linux-4.3 and 4.1.13. This will be my first
    (or second) work next year.

    J. R. Okajima

References

1. mailto:sf...@users.sourceforge.net
2. https://github.com/torvalds/linux/commit/296291cd

--
Re: aufs_do_xino_fwrite() EINTR loop (WAS: aufs_destroy_inode() deadlock?)
Akihiro Suda:
> However, the bug is a regression caused by commit 296291cd ("mm: make
> sendfile(2) killable") to the upstream of the Linux kernel.
> https://github.com/torvalds/linux/commit/296291cd
> This produces an infinite -EINTR loop in mm/filemap.c:generic_perform_write().
> Perhaps it can also affect filesystems other than AUFS.

It is doubtful. The only case where something bad can happen is process
accounting. As you may know, process accounting writes some info to a file
when a process exits. And if the filesystem where the accounting file lives
uses generic_file_write_iter() as aufs does, then the info may not be
written. But the process won't hang at all. I am afraid this issue is aufs
specific.

> So I think it's possible to keep the current AUFS, and put a patch to the
> Linux kernel.

I believe aufs should follow linux-4.3 and 4.1.13. This will be my first
(or second) work next year.

J. R. Okajima

--
Re: aufs_do_xino_fwrite() EINTR loop (WAS: aufs_destroy_inode() deadlock?)
Yes, I meant that generic_perform_write() simply returns EINTR every time it
is called, so do_xino_fwrite() cannot escape from the loop:

  static ssize_t do_xino_fwrite(vfs_writef_t func, struct file *file, void *kbuf,
                                size_t size, loff_t *pos)
  {
  ..
          do {
                  /* cannot escape from this loop */
                  err = func(file, buf.u, size, pos);
          } while (err == -EAGAIN || err == -EINTR);
  ..
  }

Sorry for my ambiguity.

BTW, I tested your patch on my machine, and it seems to work well. If
modifying 296291cd is not accepted, your patch will be the best solution.

Thanks!

2015-12-25 15:16 GMT+09:00 <[1]sf...@users.sourceforge.net>:

    Akihiro Suda:
    > However, the bug is a regression caused by commit 296291cd ("mm: make
    > sendfile(2) killable") to the upstream of the Linux kernel.
    > [2]https://github.com/torvalds/linux/commit/296291cd
    > This produces an infinite -EINTR loop in mm/filemap.c:generic_perform_write().
    > Perhaps it can also affect filesystems other than AUFS.

    Why infinite? generic_perform_write() simply returns EINTR, doesn't it?

    J. R. Okajima

References

1. mailto:sf...@users.sourceforge.net
2. https://github.com/torvalds/linux/commit/296291cd

--
Re: aufs_do_xino_fwrite() EINTR loop (WAS: aufs_destroy_inode() deadlock?)
Akihiro Suda:
> However, the bug is a regression caused by commit 296291cd ("mm: make
> sendfile(2) killable") to the upstream of the Linux kernel.
> https://github.com/torvalds/linux/commit/296291cd
> This produces an infinite -EINTR loop in mm/filemap.c:generic_perform_write().
> Perhaps it can also affect filesystems other than AUFS.

Why infinite? generic_perform_write() simply returns EINTR, doesn't it?

J. R. Okajima

--
Re: aufs_do_xino_fwrite() EINTR loop (WAS: aufs_destroy_inode() deadlock?)
Thank you for writing the patch. I will try it later.

However, the bug is a regression caused by commit 296291cd ("mm: make
sendfile(2) killable") to the upstream of the Linux kernel.
[1]https://github.com/torvalds/linux/commit/296291cd
This produces an infinite -EINTR loop in mm/filemap.c:generic_perform_write().
Perhaps it can also affect filesystems other than AUFS.

So I think it's possible to keep the current AUFS, and put a patch to the
Linux kernel. I would like to hear your opinion.

I also opened a Linux kernel Bugzilla entry:
[2]https://bugzilla.kernel.org/show_bug.cgi?id=109971

2015-12-25 14:39 GMT+09:00 <[3]sf...@users.sourceforge.net>:

    > > I'll continue to find the source of this EINTR loop.
    > > Maybe this is a bug of the Linux kernel itself
    > > (kernel/pid_namespace.c?) rather than AUFS, but I'm still not sure.

    Or aufs should support the case of the PF_EXITING flag set in
    current->flags...

    Here is my current and UNTESTED solution. If you get bored during the
    holidays, I'd suggest you try this patch and enjoy debugging.

    J. R. Okajima

References

1. https://github.com/torvalds/linux/commit/296291cd
2. https://bugzilla.kernel.org/show_bug.cgi?id=109971
3. mailto:sf...@users.sourceforge.net

--
Re: aufs_do_xino_fwrite() EINTR loop (WAS: aufs_destroy_inode() deadlock?)
> > I'll continue to find the source of this EINTR loop.
> > Maybe this is a bug of the Linux kernel itself (kernel/pid_namespace.c?)
> > rather than AUFS, but I'm still not sure.

Or aufs should support the case of the PF_EXITING flag set in
current->flags...

Here is my current and UNTESTED solution. If you get bored during the
holidays, I'd suggest you try this patch and enjoy debugging.

J. R. Okajima

a.patch.bz2
Description: BZip2 compressed data

--
Re: aufs_do_xino_fwrite() EINTR loop (WAS: aufs_destroy_inode() deadlock?)
Akihiro Suda:
> aufs do_xino_fwrite:85:java[1077]: err -4
> aufs au_xino_do_write:439:java[1077]: I/O Error, write failed (-4)

Thanks for testing.

> I'll continue to find the source of this EINTR loop.
> Maybe this is a bug of the Linux kernel itself (kernel/pid_namespace.c?)
> rather than AUFS, but I'm still not sure.

Or aufs should support the case of the PF_EXITING flag set in
current->flags... I will look closer next year.

Have nice holidays

J. R. Okajima

--
aufs_do_xino_fwrite() EINTR loop (WAS: aufs_destroy_inode() deadlock?)
Hi, Okajima-san,

Thanks a lot for looking into this.

After applying your patch, I got EINTR like this:

  aufs do_xino_fwrite:85:java[1077]: err -4
  aufs au_xino_do_write:439:java[1077]: I/O Error, write failed (-4)

The patch actually worked as an ad-hoc solution for the bug, as it breaks the
loop in do_xino_fwrite(). (It is not a mutex deadlock! Sorry for the
misunderstanding.)
[1]https://github.com/sfjro/aufs4-linux/blob/aufs4.1/fs/aufs/xino.c#L56-L59

I'll continue to find the source of this EINTR loop. Maybe this is a bug of
the Linux kernel itself (kernel/pid_namespace.c?) rather than AUFS, but I'm
still not sure.

2015-12-23 21:25 GMT+09:00 <[2]sf...@users.sourceforge.net>:

    Hello Akihiro,

    Akihiro Suda:
    > Many people are reporting that AUFS hangs up when a Java process exits.
    > [3]https://github.com/docker/docker/issues/18180
    > (Interestingly, the problem seems particular to Java)

    Thanks for reporting. This is new to me and the ML.

    Here is a debug patch. Please apply, reproduce and post the kern.info
    log. Note that the patch never solves the problem. This is just to
    investigate the problem.

    J. R. Okajima

References

1. https://github.com/sfjro/aufs4-linux/blob/aufs4.1/fs/aufs/xino.c#L56-L59
2. mailto:sf...@users.sourceforge.net
3. https://github.com/docker/docker/issues/18180

--
Re: aufs_destroy_inode() deadlock?
Hello Akihiro,

Akihiro Suda:
> Many people are reporting that AUFS hangs up when a Java process exits.
> https://github.com/docker/docker/issues/18180
> (Interestingly, the problem seems particular to Java)

Thanks for reporting. This is new to me and the ML.

Here is a debug patch. Please apply, reproduce and post the kern.info log.
Note that the patch never solves the problem. This is just to investigate
the problem.

J. R. Okajima

a.patch.bz2
Description: BZip2 compressed data

--
aufs_destroy_inode() deadlock?
Hello, AUFS users,

Many people are reporting that AUFS hangs up when a Java process exits.
[1]https://github.com/docker/docker/issues/18180
(Interestingly, the problem seems particular to Java)

I suspect this is a deadlock related to i_mutex operations in
aufs_destroy_inode(), but lockdep did not detect any deadlock. Could anyone
please look into this?

$ ps -eLf
root      1054   822  1054 11    3 07:30 ?        00:00:00 [java]
root      1054   822  1058 83    3 07:30 ?        00:00:01 [java]
root      1054   822  1065  0    3 07:30 ?        00:00:00 [java]

$ cat /proc/1054/task/1054/stack
[] do_exit+0x8e8/0x92c
[] do_group_exit+0x47/0xc4
[] __wake_up_parent+0x0/0x23
[] system_call_fastpath+0x12/0x76
[] 0x

$ cat /proc/1054/task/1058/stack
[] generic_file_write_iter+0x33/0xc1
[] new_sync_write+0x5b/0x7a
[] do_xino_fwrite+0x54/0x83
[] xino_fwrite.part.29+0x3a/0x49
[] xino_fwrite+0x29/0x6c
[] au_xino_do_write+0xa0/0xf7
[] au_xino_delete_inode+0x146/0x199
[] au_iinfo_fin+0xec/0x19d
[] aufs_destroy_inode+0xe/0x25
[] destroy_inode+0x36/0x4f
[] evict+0x148/0x150
[] iput+0x177/0x1af
[] __dentry_kill+0x128/0x198
[] dput+0x1c8/0x1f0
[] __fput+0x181/0x198
[] fput+0x9/0xb
[] task_work_run+0x85/0x9c
[] do_exit+0x425/0x92c
[] do_group_exit+0x0/0xc4
[] system_call_fastpath+0x12/0x76
[] 0x

$ cat /proc/1054/task/1065/stack
[] zap_pid_ns_processes+0x12e/0x175
[] do_exit+0x4e6/0x92c
[] do_group_exit+0x47/0xc4
[] get_signal+0x590/0x5bb
[] do_signal+0x23/0x599
[] do_notify_resume+0x12/0x4e
[] int_signal+0x12/0x17
[] 0x

$ dmesg
INFO: task java:1065 blocked for more than 120 seconds.
      Tainted: G           O    4.1.13-boot2docker #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
java            D 880037b7bc68     0  1065   1033 0x
 880037b7bc68 0002 880037b7c000 88916008
 880037852110 880037b7bd28 88916008 880037b7bc88
 825d02dd 00010920 0002 880037b7bcc8
Call Trace:
 [] schedule+0x6f/0x7e
 [] zap_pid_ns_processes+0x12e/0x175
 [] do_exit+0x4e6/0x92c
 [] ? lock_release_holdtime.part.28+0x72/0x79
 [] do_group_exit+0x47/0xc4
 [] get_signal+0x590/0x5bb
 [] do_signal+0x23/0x599
 [] ? local_clock+0x19/0x22
 [] ? lock_release_holdtime.part.28+0x72/0x79
 [] ? int_very_careful+0x5/0x46
 [] ? trace_hardirqs_on_caller+0x183/0x19f
 [] do_notify_resume+0x12/0x4e
 [] int_signal+0x12/0x17
no locks held by java/1065.

I also attached `proc-lockdep.tbz`, which contains
/proc/{locks,lockdep_chains,lockdep_stats,lock_stat,lockdep}.

=== Reproducibility ===
 * almost 100% when a single CPU is assigned to the process.
 * less than 1% when multiple CPUs are assigned.

=== How to Reproduce ===
 1. Build a VM with this Boot2Docker ISO (kernel 4.1.13 + aufs4.1 + lockdep):
    [2]https://github.com/AkihiroSuda/boot2docker/releases/tag/1.9.1-lockdep
 2. Run `docker run -it --rm java taskset 0x1 java`.

Regards,
Akihiro Suda

References

1. https://github.com/docker/docker/issues/18180
2. https://github.com/AkihiroSuda/boot2docker/releases/tag/1.9.1-lockdep

proc-lockdep.tbz
Description: Binary data

--