Package: openssh-client Version: 1:7.3p1-3+b1 Severity: normal Dear Maintainer,
I think that the bug https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=714526 is still live, after upgrading to 1:7.3p1-3+b1 (should have been fixed still 1:7.2p2-6). I straced SSH with and without `ControlMaster` and extracted revelant bits of information, redacting it, and removing useless lines. First, without `ControlMaster`: ``` # Here, Python 3 Popen prepares pipes for stdin, stdout, and stderr: 7880 1479228055.059186 pipe2([8, 9], O_CLOEXEC) = 0 7880 1479228055.059226 pipe2([10, 11], O_CLOEXEC) = 0 7880 1479228055.059258 pipe2([12, 13], O_CLOEXEC) = 0 # Then ssh is cloned: 7880 1479228055.060158 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f7bfd7fa9d0) = 7949 7949 1479228055.121259 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fd3f2a63650) = 7954 7954 1479228055.121135 execve("/bin/sh", ["/bin/sh", "-c", "exec ssh user@gate.redacted -W dev2.redacted:22"], [/* 16 vars */] <unfinished ...> # Normal exit 7949 1479228055.592488 +++ exited with 0 +++ 7871 1479228055.592519 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=7949, si_uid=1003, si_status=0, si_utime=0, si_stime=2} --- 7954 1479228055.592786 exit_group(129) = ? # Python receive EOF on stdout AND stderr 7880 1479228055.694455 read(12, <unfinished ...> 7880 1479228055.694480 <... read resumed> "", 32768) = 0 7880 1479228055.738380 read(10, <unfinished ...> 7880 1479228055.738410 <... read resumed> "", 32768) = 0 # So, as the process is exited, AND stdout/stderr had their EOF, Python 3 subprocess.Popen.communicate allows itself to wait for it: 7880 1479228055.924898 wait4(7949, <unfinished ...> 7880 1479228055.925334 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 7949 ``` Now, with `ControlMaster`: ``` # Same pipes for stdin, stdout, and stderr from Python Popen: 6749 1479227922.771783 pipe2([8, 9], O_CLOEXEC) = 0 6749 1479227922.771831 pipe2([10, 11], O_CLOEXEC) = 0 6749 1479227922.771865 pipe2([12, 13], O_CLOEXEC) = 0 # A bunch of clones, SSH starting its master process: 6749 1479227922.774839 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fa1957fa9d0) = 6824 6824 1479227922.818219 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6827 6827 1479227922.820940 execve("/bin/sh", ["/bin/sh", "-c", "exec ssh user@redacted -W redacted:22"], [/* 16 vars */] <unfinished ...> 6824 1479227923.204918 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6856 6856 1479227923.205441 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6857 6857 1479227923.206004 <... clone resumed> child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7ff4c6f11650) = 6858 # Normal exit after work is done: 6824 1479227923.335999 +++ exited with 0 +++ 6749 1479227923.336042 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6824, si_uid=1003, si_status=0, si_utime=0, si_stime=0} --- # Python is getting EOF on STDOUT 6749 1479227923.336086 read(10, "", 32768) = 0 # But here, process is a zombie (dead but not awaited), after like a minute, # I send a SIGINT and a SIGQUIT to the Python process, so everyone dies and finally: 6749 1479227986.358448 read(12, "", 32768) = 0 6827 1479227986.328027 exit_group(130) = ? 6858 1479227986.329254 exit_group(255) = ? 6749 1479227986.578489 wait4(6824, <unfinished ...> 6749 1479227986.578546 <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 6824 ``` So here it looks like the bug is exactly the same as before: stdout is closed but stderr is kept open. As Python's ``subprocess.Popen.communicate`` want to wait for both to be closed before waiting for the process, it yields to stuck Python process and zombies. -- System Information: Debian Release: 8.6 APT prefers stable APT policy: (900, 'stable'), (200, 'testing') Architecture: amd64 (x86_64) Kernel: Linux 3.16.0-4-amd64 (SMP w/4 CPU cores) Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.utf8) Shell: /bin/sh linked to /bin/dash Init: systemd (via /run/systemd/system) Versions of packages openssh-client depends on: ii adduser 3.113+nmu3 ii dpkg 1.17.27 ii libc6 2.24-5 ii libedit2 3.1-20140620-2 ii libgssapi-krb5-2 1.14.3+dfsg-2 ii libselinux1 2.3-2 ii libssl1.0.2 1.0.2j-1 ii passwd 1:4.2-3+deb8u1 ii zlib1g 1:1.2.8.dfsg-2+b1 Versions of packages openssh-client recommends: ii xauth 1:1.0.9-1 Versions of packages openssh-client suggests: pn keychain <none> pn libpam-ssh <none> pn monkeysphere <none> pn ssh-askpass <none> -- no debconf information