Package: libtirpc3
Version: 1.1.4-0.4
Severity: normal
Tags: patch

Dear Maintainer,

My NIS setup stops working occasionally.
The clients rely on subnet broadcast CALLIT requests to locate
the NIS servers.
The rpcbind process on the NIS server sees the requests but
fails to send the reply.
The strace output looks like this:

recvmsg(6, {msg_name={sa_family=AF_INET, sin_port=htons(800), 
sin_addr=inet_addr("172.27.5.162")}, msg_namelen=128->16, 
msg_iov=[{iov_base="\0\2:\270\0\0\0\0\0\0\0\2\0\1\206\240\0\0\0\2\0\0\0\5\0\0\0\0\0\0\0\0"...,
 iov_len=9000}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, 
cmsg_type=IP_PKTINFO, cmsg_data={ipi_ifindex=if_nametoindex("eth0"), 
ipi_spec_dst=inet_addr("172.27.8.1"), ipi_addr=inet_addr("172.27.63.255")}}], 
msg_controllen=32, msg_flags=0}, 0) = 64
...
sendmsg(7, {msg_name={sa_family=AF_INET, sin_port=htons(800), 
sin_addr=inet_addr("172.27.5.162")}, msg_namelen=16, 
msg_iov=[{iov_base="\0\2:\270\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\2\371\0\0\0\4"...,
 iov_len=36}], msg_iovlen=1, msg_control=[{cmsg_len=28, cmsg_level=SOL_IP, 
cmsg_type=IP_PKTINFO, cmsg_data={ipi_ifindex=0, 
ipi_spec_dst=inet_addr("127.0.0.1"), ipi_addr=inet_addr("127.0.0.1")}}], 
msg_controllen=32, msg_flags=0}, 0) = -1 EINVAL (Invalid argument)

... where I think the active ingredient is that ipi_spec_dst or
ipi_addr is 127.0.0.1 rather than the 172.27.5.162 intended
reply address.
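
For reference, the control message seen in the failing sendmsg can be
built outside rpcbind. The following is a minimal sketch, assuming
Linux and glibc; fill_pktinfo and send_with_src are hypothetical
helper names for illustration, not libtirpc functions.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for struct in_pktinfo on glibc */
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: attach an IP_PKTINFO control message that asks
 * the kernel to use spec_dst as the local source address.
 * cbuf must be suitably aligned and hold at least
 * CMSG_SPACE(sizeof(struct in_pktinfo)) bytes. Returns 0 or -1. */
static int fill_pktinfo(struct msghdr *msg, void *cbuf, size_t cbuflen,
                        const char *spec_dst)
{
    struct in_pktinfo pki;
    struct cmsghdr *cmsg;

    if (cbuflen < CMSG_SPACE(sizeof(pki)))
        return -1;
    memset(&pki, 0, sizeof(pki));       /* ipi_ifindex stays 0 */
    if (inet_pton(AF_INET, spec_dst, &pki.ipi_spec_dst) != 1)
        return -1;

    memset(cbuf, 0, cbuflen);
    msg->msg_control = cbuf;
    msg->msg_controllen = CMSG_SPACE(sizeof(pki));
    cmsg = CMSG_FIRSTHDR(msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = IPPROTO_IP;      /* same value as SOL_IP */
    cmsg->cmsg_type = IP_PKTINFO;
    cmsg->cmsg_len = CMSG_LEN(sizeof(pki));
    memcpy(CMSG_DATA(cmsg), &pki, sizeof(pki));
    return 0;
}

/* Hypothetical helper: send one datagram to dst carrying that
 * control message. Returns what sendmsg returns, or -1. */
static ssize_t send_with_src(int fd, const struct sockaddr_in *dst,
                             const char *spec_dst,
                             const void *buf, size_t len)
{
    union {                             /* alignment idiom from cmsg(3) */
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } ctl;
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    msg.msg_name = (void *)dst;
    msg.msg_namelen = sizeof(*dst);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    if (fill_pktinfo(&msg, ctl.buf, sizeof(ctl.buf), spec_dst) != 0)
        return -1;
    return sendmsg(fd, &msg, 0);
}
```

On amd64, CMSG_LEN and CMSG_SPACE for struct in_pktinfo come to 28 and
32, matching cmsg_len=28 and msg_controllen=32 in the trace. Passing
"127.0.0.1" as spec_dst toward a non-loopback destination is the case
expected to correspond to the EINVAL above; that has not been checked
with this sketch.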
Once an rpcbind process has got into this state, it doesn't
recover without being restarted.
rpcbind is calling svc_sendreply on the xprt it created in
create_rmtcall_fd, which isn't where the request originated.
That calls svc_dg_reply which assumes:

                /* cmsg already set in svc_dg_recv */

... as of:

https://git.linux-nfs.org/?p=steved/libtirpc.git;a=commit;h=74ef3df0236c55185225c62fba34953f2582da72
(Try to ensure datagram replies come from the address requests were sent to.)

The control data on that xprt starts out zero-initialized, so
everything works fine until a port scan or some such sends a
datagram to the same port.
That datagram's IP_PKTINFO gets remembered and used on every
subsequent reply.
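
The sequence can be modelled in isolation. This is a simplified
sketch under stated assumptions, not libtirpc's code: struct
model_xprt, model_recv and model_reply are hypothetical names, no
sockets are involved, and the clear_after flag stands in for resetting
msg_control and msg_controllen once a reply has gone out.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for struct in_pktinfo on glibc */
#endif
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for a datagram transport's saved state. */
struct model_xprt {
    struct msghdr msg;
    union {                     /* aligned control buffer */
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } ctl;
};

/* Models the receive path: remember the local address (network byte
 * order) that the incoming datagram was addressed to. */
static void model_recv(struct model_xprt *x, in_addr_t local)
{
    struct in_pktinfo pki;
    struct cmsghdr *cmsg;

    memset(&pki, 0, sizeof(pki));
    pki.ipi_spec_dst.s_addr = local;

    memset(&x->ctl, 0, sizeof(x->ctl));
    x->msg.msg_control = x->ctl.buf;
    x->msg.msg_controllen = sizeof(x->ctl.buf);
    cmsg = CMSG_FIRSTHDR(&x->msg);
    assert(cmsg != NULL);
    cmsg->cmsg_level = IPPROTO_IP;
    cmsg->cmsg_type = IP_PKTINFO;
    cmsg->cmsg_len = CMSG_LEN(sizeof(pki));
    memcpy(CMSG_DATA(cmsg), &pki, sizeof(pki));
}

/* Models the reply path: return the source address the reply would
 * request, or INADDR_ANY when no control data is attached.
 * With clear_after set, the saved control data is dropped afterwards. */
static in_addr_t model_reply(struct model_xprt *x, int clear_after)
{
    in_addr_t src = htonl(INADDR_ANY);
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&x->msg);

    if (cmsg != NULL && cmsg->cmsg_type == IP_PKTINFO) {
        struct in_pktinfo pki;
        memcpy(&pki, CMSG_DATA(cmsg), sizeof(pki));
        src = pki.ipi_spec_dst.s_addr;
    }
    if (clear_after) {
        x->msg.msg_control = NULL;
        x->msg.msg_controllen = 0;
    }
    return src;
}
```

Starting from a zeroed struct model_xprt, model_reply yields
INADDR_ANY. After one model_recv with the loopback address, every
model_reply(x, 0) keeps returning loopback. With clear_after set, only
the first reply after the stray datagram does.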

I can demonstrate the problem without needing NIS.
First enable remote call support.
On Debian, that can be done with:

sudo tee --append /etc/default/rpcbind <<EOF
OPTIONS="$OPTIONS -r"
EOF
sudo /etc/init.d/rpcbind restart

Then rpcinfo can be used to call the null request of the
portmap program:

rpcinfo -b 100000 4

When everything's working, this returns results for me like:

192.168.1.69.0.111      shuttle
192.168.1.69.0.111      shuttle
fe80::82ee:73ff:fed9:23d5.0.111 shuttle
fe80::ccbf:3f03:7b34:c5a1.0.111 shuttle
192.168.1.69.0.111      shuttle
192.168.1.69.0.111      shuttle
fe80::82ee:73ff:fed9:23d5.0.111 shuttle
fe80::ccbf:3f03:7b34:c5a1.0.111 shuttle

Then find the ephemeral port rpcbind is using for remote calls,
with e.g.:

sudo lsof -p $(pidof rpcbind)

The last two ports mentioned are the ones for me:

rpcbind 7610 _rpc   10u  IPv4          124008328      0t0       UDP *:55138
rpcbind 7610 _rpc   11u  IPv6          124008329      0t0       UDP *:61854

I saw more convincing results with IPv6, so picking that last
port number, I send it any old UDP request:

ruby -we 'require "socket"; so = UDPSocket.new(Socket::AF_INET6); so.send("", 0, "::1", 61854)'

Repeating my rpcinfo command, I then see just IPv4:

192.168.1.69.0.111      shuttle
192.168.1.69.0.111      shuttle
192.168.1.69.0.111      shuttle
192.168.1.69.0.111      shuttle

By zeroing the msg_control information after the reply is sent
(see the attached patch) I can make rpcbind recover after sending
just one reply astray, making the output like this:

192.168.1.69.0.111      shuttle
192.168.1.69.0.111      shuttle
fe80::ccbf:3f03:7b34:c5a1.0.111 shuttle
192.168.1.69.0.111      shuttle
192.168.1.69.0.111      shuttle
fe80::82ee:73ff:fed9:23d5.0.111 shuttle
fe80::ccbf:3f03:7b34:c5a1.0.111 shuttle

That lost reply is distasteful but at least I haven't broken
the intent of the "Try to ensure" patch.


-- System Information:
Debian Release: 10.10
  APT prefers oldstable-proposed-updates-debug
  APT policy: (500, 'oldstable-proposed-updates-debug'), (500, 
'oldstable-debug'), (500, 'oldstable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.19.0-16-amd64 (SMP w/8 CPU cores)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, 
TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages libtirpc3:amd64 depends on:
ii  libc6             2.28-10
ii  libcom-err2       1.44.5-1+deb10u3
ii  libgssapi-krb5-2  1.17-3+deb10u2
ii  libk5crypto3      1.17-3+deb10u2
ii  libkrb5-3         1.17-3+deb10u2
ii  libtirpc-common   1.1.4-0.4

libtirpc3:amd64 recommends no packages.

libtirpc3:amd64 suggests no packages.

-- no debconf information
--- src/svc_dg.c.orig   2021-09-19 18:24:32.462610751 -0700
+++ src/svc_dg.c        2021-09-19 18:14:52.271066229 -0700
@@ -278,6 +278,8 @@
                        if (su->su_cache)
                                cache_set(xprt, slen);
                }
+               msg->msg_control = NULL;
+               msg->msg_controllen = 0;
        }
        return (stat);
 }
