https://bugzilla.mindrot.org/show_bug.cgi?id=3639

--- Comment #26 from JM <jtm.moon.forum.user+mind...@gmail.com> ---
tl;dr a seccomp sandbox violation for syscall `20` (still) occurs from a `read`.
      This is just a more detailed retelling of what was previously discussed.
      Scroll to end for thoughts...

### problem specifics

Failed `read` in parent process

    read(fd, s + pos, n - pos);

which is

    read(5, '\014\221b', 4);

returns `0`.

Failed `read` will cause the following audit event (journalctl -f -x)

    Dec 17 15:59:35 host1 kernel: audit: type=1326 audit(1702857575.824:3180): auid=4294967295 uid=107 gid=65534 ses=4294967295 pid=1920 comm="sshd" exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 arch=40000028 syscall=20 compat=1 ip=0xf7afd80c code=0x0

And the same when compiled with
`CFLAGS="-DSANDBOX_SECCOMP_FILTER_DEBUG"`

    Dec 17 22:37:50 pifuboo audit[10678]: SECCOMP auid=4294967295 uid=107 gid=65534 ses=4294967295 pid=10678 comm="sshd" exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 arch=40000028 syscall=20 compat=1 ip=0xf77de80c code=0x0
    Dec 17 22:37:50 pifuboo audit[10678]: ANOM_ABEND auid=4294967295 uid=107 gid=65534 ses=4294967295 pid=10678 comm="sshd" exe="/root/Projects/openssh-9.2p1-WIP/sshd" sig=31 res=1

The failed Linux system call `20` is `epoll_create1` according to
`ausyscall`

    $ ausyscall 20
    epoll_create1

So the `read` at some point calls syscall `20`. See section "Summary
Thoughts" about this.

Here is the failed `read` call
https://github.com/openssh/openssh-portable/blob/V_9_2_P1/atomicio.c#L66
It is always the `read` call with values `fd=5`, `n=4`.
`read` returns `0`.
`errno` is not changed after `read` returns.

The stack just before the failed `read` call is:

    #1  0x004701f8 in atomicio6
        (f=f@entry=0xf7c65478 <read>, fd=fd@entry=5, _s=0xfffeead8, _s@entry=0xfffeead0, n=n@entry=4, cb=cb@entry=0x0, cb_arg=cb_arg@entry=0x0)
        at atomicio.c:67
    #2  0x00470284 in atomicio
        (f=f@entry=0xf7c65478 <read>, fd=fd@entry=5, _s=_s@entry=0xfffeead0, n=n@entry=4)
        at atomicio.c:101
    #3  0x00434520 in mm_request_receive
        (sock=5, m=m@entry=0x4f2b88)
        at monitor_wrap.c:149
    #4  0x00431178 in monitor_read
        (ssh=ssh@entry=0x4f3388, pmonitor=pmonitor@entry=0x4f2498, ent=0x4e0114 <mon_dispatch_proto20>, pent=pent@entry=0xfffeeb78)
        at monitor.c:501
    #5  0x00433b5c in monitor_child_preauth
        (ssh=ssh@entry=0x4f3388, pmonitor=0x4f2498)
        at monitor.c:301
    #6  0x0040a388 in privsep_preauth
        (ssh=0x4f3388)
        at sshd.c:502
    #7  main
        (ac=<optimized out>, av=0x4e31a0)
        at sshd.c:2240

(line numbers in `atomicio.c` slightly off due to insertion of
`raise(SIGINT);`)

The debug log from the start of client connection is:

    debug3: fd 5 is not O_NONBLOCK
    debug1: Server will not fork when running in debugging mode.
    debug3: send_rexec_state: entering fd = 8 config len 3296
    debug3: ssh_msg_send: type 0
    debug3: send_rexec_state: done
    debug1: rexec start in 5 out 5 newsock 5 pipe -1 sock 8
    process 32277 is executing new program: /root/Projects/openssh-9.2p1/sshd
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
    debug3: recv_rexec_state: entering fd = 5
    debug3: ssh_msg_recv entering
    debug3: recv_rexec_state: done
    debug2: parse_server_config_depth: config rexec len 3296
    debug3: rexec:14 setting Port 55222
    debug3: rexec:22 setting HostKey /root/Projects/openssh-9.2p1/ssh_host_ecdsa_key
    debug3: rexec:23 setting HostKey /root/Projects/openssh-9.2p1/ssh_host_ed25519_key
    debug3: rexec:24 setting HostKey /root/Projects/openssh-9.2p1/ssh_host_rsa_key
    debug3: rexec:45 setting AuthorizedKeysFile .ssh/authorized_keys
    debug3: rexec:113 setting Subsystem sftp /usr/libexec/sftp-server
    debug1: sshd version OpenSSH_9.2, OpenSSL 1.1.1w  11 Sep 2023
    debug1: private host key #0: ecdsa-sha2-nistp256 SHA256:7OUDaY7vmsaJPDkqGWPmdiw5kjY4bVSwCd94nJqT7/o
    debug1: private host key #1: ssh-ed25519 SHA256:CuPO+bnbHMCkaNEybTHeYSjdNpiNdAlntO9gh0V9lxs
    debug1: private host key #2: ssh-rsa SHA256:ZYZLLhbWdOMFKDGw3pcn954Wz6RhwtDoI5WjJsZpXhk
    debug1: inetd sockets after dupping: 3, 3
    debug3: process_channel_timeouts: setting 0 timeouts
    debug3: channel_clear_timeouts: clearing
    Connection from 192.168.124.214 port 57930 on 192.168.124.214 port 55222 rdomain ""
    debug1: Local version string SSH-2.0-OpenSSH_9.2
    debug1: Remote protocol version 2.0, remote software version OpenSSH_8.4p1 Raspbian-5+deb11u2
    debug1: compat_banner: match: OpenSSH_8.4p1 Raspbian-5+deb11u2 pat OpenSSH* compat 0x04000000
    debug2: fd 3 setting O_NONBLOCK
    debug3: ssh_sandbox_init: preparing seccomp filter sandbox
    [Detaching after fork from child process 32308]
    debug2: Network child is on pid 32308
    debug3: preauth child monitor started
    debug3: privsep user:group 107:65534 [preauth]
    debug1: permanently_set_uid: 107/65534 [preauth]
    debug3: ssh_sandbox_child: setting PR_SET_NO_NEW_PRIVS [preauth]
    debug3: ssh_sandbox_child: attaching seccomp filter program [preauth]
    debug1: monitor_read_log: child log fd closed
    debug3: mm_request_receive: entering

Dumping `/proc/$parentpid/status` just before the `read` failure shows:

    Seccomp:        0
    Seccomp_filters:        0

Dumping `/proc/$childpid/status` just before the `read` failure shows:

    Seccomp:        3
    Seccomp_filters:        1

File descriptor 5 of the parent process is a STREAM (according to
`lsof`)

    $ lsof -p 11715
    COMMAND   PID USER   FD   TYPE             DEVICE SIZE/OFF    NODE NAME
    sshd    11715 root  cwd    DIR              179,2     4096       2 /
    sshd    11715 root  rtd    DIR              179,2     4096       2 /
    sshd    11715 root  txt    REG              179,2  1318404   72912 /root/Projects/openssh-9.2p1-WIP/sshd
    sshd    11715 root  mem    REG              179,2    42628    2737 /lib/arm-linux-gnueabihf/libnss_files-2.31.so
    sshd    11715 root  mem    REG              179,2   116324    4913 /lib/arm-linux-gnueabihf/libgcc_s.so.1
    sshd    11715 root  mem    REG              179,2   137364    2748 /lib/arm-linux-gnueabihf/libpthread-2.31.so
    sshd    11715 root  mem    REG              179,2    13864    2668 /lib/arm-linux-gnueabihf/libdl-2.31.so
    sshd    11715 root  mem    REG              179,2  1315688    2667 /lib/arm-linux-gnueabihf/libc-2.31.so
    sshd    11715 root  mem    REG              179,2    95880   11965 /lib/arm-linux-gnueabihf/libz.so.1.2.11
    sshd    11715 root  mem    REG              179,2  2150824   11138 /usr/lib/arm-linux-gnueabihf/libcrypto.so.1.1
    sshd    11715 root  mem    REG              179,2     9796    2753 /lib/arm-linux-gnueabihf/libutil-2.31.so
    sshd    11715 root  mem    REG              179,2   210340    1499 /lib/arm-linux-gnueabihf/libcrypt.so.1.1.0
    sshd    11715 root  mem    REG              179,2    17708   14672 /usr/lib/arm-linux-gnueabihf/libarmmem-v7l.so
    sshd    11715 root  mem    REG              179,2   146888    2655 /lib/arm-linux-gnueabihf/ld-2.31.so
    sshd    11715 root    0u   CHR                1,3      0t0       5 /dev/null
    sshd    11715 root    1u   CHR                1,3      0t0       5 /dev/null
    sshd    11715 root    2u   CHR              136,4      0t0       7 /dev/pts/4
    sshd    11715 root    3u  IPv4            8165255      0t0     TCP localhost:55522->localhost:48024 (ESTABLISHED)
    sshd    11715 root    5u  unix 0x00000000c618f590      0t0 8165266 type=STREAM

### Other miscellaneous observations:

* the child process quickly becomes "defunct"

Oddly, I can see that a child process is created by debug-printing PIDs
at certain points. e.g. a debug log message prints

    debug2: Network child is on pid 11719

But later, just before the failed `read`, that child process is in a
"defunct" state. e.g. command `ps -ef` shows

    $ ps -ef
    ...
    sshd: [accepted]
    [sshd] <defunct>

I suspect the child process is immediately dying and the later parent
process `read` then fails.

* Years ago on this same system, I locally built 8.4p1, 8.6p1, and 9.0p1,
which ran just fine.
  8.4p1 was built Feb 2021
  8.6p1 was built Jun 2022
  9.0p1 was built Jun 2022

  Yet, I downloaded those same old versions today and they failed. Each
hit the same child process abort.

* I verified the address of `f` is function `read`
(https://github.com/openssh/openssh-portable/blob/V_9_2_P1/atomicio.c#L66)
  with code snippet:

      if (f == read) {
         debug3("read(%d, '%s', %u); (errno=%u)", fd, s + pos, n - pos, errno);
      } else {
         debug3("(f=%p) (%d, '%s', %u); (errno=%u)", f, fd, s + pos, n - pos, errno);
      }

> Could you try adding a similar printf+getpid+exit

* verified within the `sshd` process, `__NR_epoll_create1 = 357` and
`__NR_getpid = 20` via `debug3` prints, e.g. code

    debug3("__NR_getpid = %d", __NR_getpid);
    debug3("__NR_epoll_create1 = %d", __NR_epoll_create1);
    int _pid = getpid();
    debug3("getpid() = %d", _pid);

* built with `./configure --with-sandbox=no` and it runs okay (no child
process aborts)

* other sandboxes failed to compile due to missing headers or kernel
capabilities (and I didn't feel like chasing these down)
  * systrace
  * rlimit
  * capsicum

* Various `fcntl` `GET` checks of file descriptor 5. `errno` was set to
`0` before each call to `fcntl`.

    fcntl(5, F_GETFD) returned 1 (0x00000001) (errno=0)
    fcntl(5, F_GETOWN_EX) returned 0 (0x00000000) (errno=0) owner.type=0, owner.pid=0
    fcntl(5, F_GETOWN) returned 0 (0x00000000) (errno=0)
    fcntl(5, F_GETPIPE_SZ) returned -1 (0xffffffff) (errno=9)
    fcntl(5, F_GET_SEALS) returned -1 (0xffffffff) (errno=22)
    fcntl(5, F_GETLEASE ) returned 2 (0x00000002) (errno=0)

* for posterity, if anyone else can repro this,
  then manually add this code in `atomicio.c` function `atomicio6` to
cause a GDB break:

    if (fd == 5 && n == 4 && pos == 0 && errno == 32) {
        raise(SIGINT);
    }

  Those are the happenstance values before the failed `read` call.
  Add the prior snippet just before code:

      res = (f) (fd, s + pos, n - pos);

   gdb command:

       $ gdb --args "$(realpath .)/sshd" -ddddd -f sshd_config

### Summary Thoughts

> So perhaps the problem here is that either it's picking up 32bit vs 64bit 
> headers, or that the binary is some kind of 32bit compatibility mode but the 
> sandbox is expecting the 64bit syscalls.

Looking at the above `grep -r -Ee '__NR_getpid|__NR_epoll_create1'`,
maybe the resident compiled libc
used header `/usr/include/asm-generic/unistd.h`, where `#define
__NR_epoll_create1 20`,
and the openssh-server build uses header
`/usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h`,
where `#define __NR_getpid (__NR_SYSCALL_BASE + 20)` (or vice versa).

Looking at the files

    $ l /usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h
    -rw-r--r-- 1 root root 19938 Apr  5  2023 /usr/include/arm-linux-gnueabihf/asm/unistd-eabi.h

    $ l /usr/include/asm-generic/unistd.h
    -rw-r--r-- 1 root root 31480 Apr  5  2023 /usr/include/asm-generic/unistd.h

Looking at the compiled `libc`

    $ find /usr -name 'libc.so'
    /usr/lib/arm-linux-gnueabihf/libc.so

    $ ls -l /usr/lib/arm-linux-gnueabihf/libc.so
    -rw-r--r-- 1 root root 289 Oct  3 12:55 /usr/lib/arm-linux-gnueabihf/libc.so

So maybe my include or library pathing was reconfigured (messed up)
some time in October 2023 (oh man, how screwed am I?)

### in the meantime ...

I have a workaround using `--with-sandbox=no`.

If you'd like me to try something else then please let me know.
Otherwise, I've spent a fair amount of time in this rabbit hole and
need to get back to other things (i.e. other rabbit holes 😉).


-James Moon
(https://github.com/jtmoon79/)
(https://twitter.com/jtmoon1979/)
