https://bugzilla.mindrot.org/show_bug.cgi?id=3639
--- Comment #26 from JM <jtm.moon.forum.user+mind...@gmail.com> --- tl;dr a seccomp sandbox violation `20` occurs from a `read` (still). This is just a more detailed retelling of what was previously discussed. Scroll to end for thoughts... ### problem specifics Failed `read` in parent process read(fd, s + pos, n - pos); which is read(5, '\014\221b', 4); returns `0`. Failed `read` will cause the following audit event (journalctl -f -x) Dec 17 15:59:35 host1 kernel: audit: type=1326 audit(1702857575.824:3180): auid=4294967295 uid=107 gid=65534 ses=4294967295 pid=1920 comm="sshd" exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 arch=40000028 syscall=20 compat=1 ip=0xf7afd80c code=0x0 And the same when compiled with `CFLAGS="-DDSANDBOX_SECCOMP_FILTER_DEBUG"` Dec 17 22:37:50 pifuboo audit[10678]: SECCOMP auid=4294967295 uid=107 gid=65534 ses=4294967295 pid=10678 comm="sshd" exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 arch=40000028 syscall=20 compat=1 ip=0xf77de80c code=0x0 Dec 17 22:37:50 pifuboo audit[10678]: ANOM_ABEND auid=4294967295 uid=107 gid=65534 ses=4294967295 pid=10678 comm="sshd" exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 res=1 The failed linux system call `20` is `epoll_create1` according to `ausyscall` $ ausyscall 20 epoll_create1 So the `read` at some point calls syscall `20`. See section "Summary Thoughts" about this. Here is the failed `read` call https://github.com/openssh/openssh-portable/blob/V_9_2_P1/atomicio.c#L66 It is always the `read` call with values `fd=5`, `n=4`. `read` returns `0`. `errno` is not changed after `read` returns. The stack just before the failed `read` call is: #1 0x004701f8 in atomicio6 (f=f@entry=0xf7c65478 <read>, fd=fd@entry=5, _s=0xfffeead8, _s@entry=0xfffeead0, n=n@entry=4, cb=cb@entry=0x0, cb_arg=cb_arg@entry=0x0) at atomicio.c:67 #2 0x00470284 in atomicio (f=f@entry=0xf7c65478 <read>, fd=fd@entry=5, _s=_s@entry=0xfffeead0, n=n@entry=4) at atomicio.c:101 #3 0x00434520 in mm_request_receive (sock=5, m=m@entry=0x4f2b88) at monitor_wrap.c:149 #4 0x00431178 in monitor_read (ssh=ssh@entry=0x4f3388, pmonitor=pmonitor@entry=0x4f2498, ent=0x4e0114 <mon_dispatch_proto20>, pent=pent@entry=0xfffeeb78) at monitor.c:501 #5 0x00433b5c in monitor_child_preauth (ssh=ssh@entry=0x4f3388, pmonitor=0x4f2498) at monitor.c:301 #6 0x0040a388 in privsep_preauth (ssh=0x4f3388) at sshd.c:502 #7 main (ac=<optimized out>, av=0x4e31a0) at sshd.c:2240 (line numbers in `atomicio.c` slightly off due to insertion of `raise(SIGINT);`) The debug log from the start of client connection is: debug3: fd 5 is not O_NONBLOCK debug1: Server will not fork when running in debugging mode. debug3: send_rexec_state: entering fd = 8 config len 3296 debug3: ssh_msg_send: type 0 debug3: send_rexec_state: done debug1: rexec start in 5 out 5 newsock 5 pipe -1 sock 8 process 32277 is executing new program: /root/Projects/openssh-9.2p1/sshd [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1". debug3: recv_rexec_state: entering fd = 5 debug3: ssh_msg_recv entering debug3: recv_rexec_state: done debug2: parse_server_config_depth: config rexec len 3296 debug3: rexec:14 setting Port 55222 debug3: rexec:22 setting HostKey /root/Projects/openssh-9.2p1/ssh_host_ecdsa_key debug3: rexec:23 setting HostKey /root/Projects/openssh-9.2p1/ssh_host_ed25519_key debug3: rexec:24 setting HostKey /root/Projects/openssh-9.2p1/ssh_host_rsa_key debug3: rexec:45 setting AuthorizedKeysFile .ssh/authorized_keys debug3: rexec:113 setting Subsystem sftp /usr/libexec/sftp-server debug1: sshd version OpenSSH_9.2, OpenSSL 1.1.1w 11 Sep 2023 debug1: private host key #0: ecdsa-sha2-nistp256 SHA256:7OUDaY7vmsaJPDkqGWPmdiw5kjY4bVSwCd94nJqT7/o debug1: private host key #1: ssh-ed25519 SHA256:CuPO+bnbHMCkaNEybTHeYSjdNpiNdAlntO9gh0V9lxs debug1: private host key #2: ssh-rsa SHA256:ZYZLLhbWdOMFKDGw3pcn954Wz6RhwtDoI5WjJsZpXhk debug1: inetd sockets after dupping: 3, 3 debug3: process_channel_timeouts: setting 0 timeouts debug3: channel_clear_timeouts: clearing Connection from 192.168.124.214 port 57930 on 192.168.124.214 port 55222 rdomain "" debug1: Local version string SSH-2.0-OpenSSH_9.2 debug1: Remote protocol version 2.0, remote software version OpenSSH_8.4p1 Raspbian-5+deb11u2 debug1: compat_banner: match: OpenSSH_8.4p1 Raspbian-5+deb11u2 pat OpenSSH* compat 0x04000000 debug2: fd 3 setting O_NONBLOCK debug3: ssh_sandbox_init: preparing seccomp filter sandbox [Detaching after fork from child process 32308] debug2: Network child is on pid 32308 debug3: preauth child monitor started debug3: privsep user:group 107:65534 [preauth] debug1: permanently_set_uid: 107/65534 [preauth] debug3: ssh_sandbox_child: setting PR_SET_NO_NEW_PRIVS [preauth] debug3: ssh_sandbox_child: attaching seccomp filter program [preauth] debug1: monitor_read_log: child log fd closed debug3: mm_request_receive: entering Dumping `/proc/$parentpid/status` just before the `read` failure shows: Seccomp: 0 Seccomp_filters: 0 Dumping `/proc/$childpid/status` just before the `read` failure shows: Seccomp: 3 Seccomp_filters: 1 File descriptor 5 of the parent process is a STREAM (according to `lsof`) $ lsof -p 11715 COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME sshd 11715 root cwd DIR 179,2 4096 2 / sshd 11715 root rtd DIR 179,2 4096 2 / sshd 11715 root txt REG 179,2 1318404 72912 /root/Projects/openssh-9.2p1-WIP/sshd sshd 11715 root mem REG 179,2 42628 2737 /lib/arm-linux-gnueabihf/libnss_files-2.31.so sshd 11715 root mem REG 179,2 116324 4913 /lib/arm-linux-gnueabihf/libgcc_s.so.1 sshd 11715 root mem REG 179,2 137364 2748 /lib/arm-linux-gnueabihf/libpthread-2.31.so sshd 11715 root mem REG 179,2 13864 2668 /lib/arm-linux-gnueabihf/libdl-2.31.so sshd 11715 root mem REG 179,2 1315688 2667 /lib/arm-linux-gnueabihf/libc-2.31.so sshd 11715 root mem REG 179,2 95880 11965 /lib/arm-linux-gnueabihf/libz.so.1.2.11 sshd 11715 root mem REG 179,2 2150824 11138 /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.1 sshd 11715 root mem REG 179,2 9796 2753 /lib/arm-linux-gnueabihf/libutil-2.31.so sshd 11715 root mem REG 179,2 210340 1499 /lib/arm-linux-gnueabihf/libcrypt.so.1.1.0 sshd 11715 root mem REG 179,2 17708 14672 /usr/lib/arm-linux-gnueabihf/libarmmem-v7l.so sshd 11715 root mem REG 179,2 146888 2655 /lib/arm-linux-gnueabihf/ld-2.31.so sshd 11715 root 0u CHR 1,3 0t0 5 /dev/null sshd 11715 root 1u CHR 1,3 0t0 5 /dev/null sshd 11715 root 2u CHR 136,4 0t0 7 /dev/pts/4 sshd 11715 root 3u IPv4 8165255 0t0 TCP localhost:55522->localhost:48024 (ESTABLISHED) sshd 11715 root 5u unix 0x00000000c618f590 0t0 8165266 type=STREAM ### Other miscellaneous observations: * the child process quickly becomes "defunct" Oddly, I can see that a child process is created by debug-printing PIDs at certain points. e.g. a debug log message prints debug2: Network child is on pid 11719 But later, just before the failed `read`, that child process is in a "defunct" state. e.g. command `ps -ef` shows $ ps -ef ... sshd: [accepted] [sshd] <defunct> I suspect the child process is immediately dying and the later parent process `read` then fails. * Years ago on this same system, I locally built 8.4p1, 8.6p1, 9.0p1, that have run just fine. 8.4p1 was built Feb 2021 8.6p1 was built Jun 2022 9.0p1 was built Jun 2022 Yet, I downloaded those same old versions today and they failed. Each hit the same child process abort. * I verified the address of `f` is function `read` (https://github.com/openssh/openssh-portable/blob/V_9_2_P1/atomicio.c#L66) with code snippet: if (f == read) { debug3("read(%d, '%s', %u); (errno=%u)", fd, s + pos, n - pos, errno); } else { debug3("(f=%p) (%d, '%s', %u); (errno=%u)", f, fd, s + pos, n - pos, errno); } > Could you add try adding a similar printf+getpid+exit * verified within the `sshd` process, `__NR_epoll_create1 = 357` and `__NR_getpid = 20` via `debug3` prints, e.g. code debug3("__NR_getpid = %d", __NR_getpid); debug3("__NR_epoll_create1 = %d", __NR_epoll_create1); int _pid = getpid(); debug3("getpid() = %d", _pid); * built with `./configure --with-sandbox=no` and it runs okay (no child process aborts) * other sandboxes failed to compile due to missing headers or kernel capabilities (and I didn't feel like chasing these down) * systrace * rlimit * capsicum * Various `fcntl` `GET` checks of file descriptor 5. `errno` was set to `0` before each call to `fcntl`. fcntl(5, F_GETFD) returned 1 (0x00000001) (errno=0) fcntl(5, F_GETOWN_EX) returned 0 (0x00000000) (errno=0) owner.type=0, owner.pid=0 fcntl(5, F_GETOWN) returned 0 (0x00000000) (errno=0) fcntl(5, F_GETPIPE_SZ) returned -1 (0xffffffff) (errno=9) fcntl(5, F_GET_SEALS) returned -1 (0xffffffff) (errno=22) fcntl(5, F_GETLEASE ) returned 2 (0x00000002) (errno=0) * for posterity, if anyone else can repro this, then manually add this code in `atomicio.c` function `atomicio6` to cause a GDB break: if (fd == 5 && n == 4 && pos == 0 && errno == 32) { raise(SIGINT); } Those are the happenstance values before the failed `read` call. Add the prior snippet just before code: res = (f) (fd, s + pos, n - pos); gdb command: $ gdb --args "$(realpath .)/sshd" -ddddd -f sshd_config ### Summary Thoughts > So perhaps the problem here is that either it's picking up 32bit vs 64bit > headers, or that the binary is some kind of 32bit compatibility mode but the > sandbox is expecting the 64bit syscalls. Looking at the above `grep -r -Ee '__NR_getpid|__NR_epoll_create1'`, maybe the resident compiled libc used header `/usr/include/asm-generic/unistd.h` where `#define __NR_epoll_create1 20`, and openssh-server builds is uses header `/usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h` where `#define __NR_getpid (__NR_SYSCALL_BASE + 20)` (or vice-versa). Looking at the files $ l /usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h -rw-r--r-- 1 root root 19938 Apr 5 2023 /usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h $ l /usr/include/asm-generic/unistd.h -rw-r--r-- 1 root root 31480 Apr 5 2023 /usr/include/asm-generic/unistd.h Looking at the compiled `libc` $ find /usr -name 'libc.so' /usr/lib/arm-linux-gnueabihf/libc.so $ ls -l /usr/lib/arm-linux-gnueabihf/libc.so -rw-r--r-- 1 root root 289 Oct 3 12:55 /usr/lib/arm-linux-gnueabihf/libc.so So maybe my include or library pathing is was reconfigured (messed up) some time in October 2023 (oh man, how screwed am I?) ### in the meantime ... I have a workaround using `--with-sandbox=no`. If you'd like me to try something else then please let me know. Otherwise, I've spent a fair amount of time in this rabbit hole and need to get back to other things (i.e. other rabbit holes 😉). -James Moon (https://github.com/jtmoon79/) (https://twitter.com/jtmoon1979/) -- You are receiving this mail because: You are watching someone on the CC list of the bug. You are watching the assignee of the bug. _______________________________________________ openssh-bugs mailing list openssh-bugs@mindrot.org https://lists.mindrot.org/mailman/listinfo/openssh-bugs