This can happen with any long lived UDP-based protocol that is natted to a dynamically configured address. I think this is not a bug; PF is doing exactly what you have asked it to.

The issue is that you're sending packets frequently (due to the fairly low keepalive timer) which refreshes the PF state and keeping the NAT mapping alive. Normally this is what you want, it's the whole point of the keepalive, but it falls apart when the address changes.

As you've said, flushing all states does the trick (and it's obvious why); this could possibly be automated via ifstated. It's a bit heavy handed to flush all states when only those involving one natted IP address actually need it but pf doesn't have a more targetted way for states using a certain NAT address, short of parsing pfctl -ss and killing the individual states by id. Alternatively, not a generic solution but might work in your case is to use ifstated to run pfctl -k to delete states from the known address of the rpi to the known wireguard endpoint address. I'm hoping that the change of address would also involve a link state change so that ifstated can trigger on this. However the interface used by pfctl -k is buggy and I can't remember if this use will run into a problem or not...

I don't know if it might make sense to handle automatically in the kernel; it would be convenient for the user but would I think be delicate work, especially regarding MP locking.

--
 Sent from a phone, apologies for poor formatting.
On 19 May 2021 01:16:06 Mikolaj Kucharski <miko...@kucharski.name> wrote:

Forgot to also show PF rules:

pce-0035# grep -ve '^$' -e '^#' /etc/pf.conf
set skip on lo
set limit states 25000
queue q_umb0 on umb0 flows 1024 qlimit 50 quantum 300 default
queue q_athn0 on athn0 flows 1024 qlimit 100 quantum 300 default
match out on umb0 inet from !(umb0) nat-to (umb0:0)
match on umb0 inet all scrub (no-df random-id max-mss 1460)
block return
pass quick proto icmp all
pass quick proto icmp6 all
pass quick on tun0 from (tun0:network) to any keep state (if-bound)
pass in quick proto tcp from any to (self) port ssh
pass in quick proto udp from any to (self) port 51820
pass quick on { athn0 em0 em1 em2 }
pass out


On Wed, May 19, 2021 at 12:10:45AM +0000, miko...@kucharski.name wrote:
Synopsis: wireguard traffic blackholed after umb(4) changes ip addr
Category: kernel
Environment:
System      : OpenBSD 6.9
Details : OpenBSD 6.9-current (GENERIC.MP) #14: Tue May 11 18:41:12 UTC 2021
r...@pc1.home.local:/home/mkucharski/openbsd/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine     : amd64
Description:
This also occurs on vanilla kernel, but at present I'm running
custom kernel, with some athn(4) related changes.

Once in a while umb(4) disconnects from network and reconnects with new
IP address. OpenBSD machine, from this bug report has a wg0 interface
and when umb(4) changes IP address wireguard tunnel from OpenBSD to
external peer klMOiaGJpjMM1bqJouUirOIJRRqcQ8J5QdWOErfj5UM= is NOT
affected:

pce-0035# ifconfig wg0
wg0: flags=80c3<UP,BROADCAST,RUNNING,NOARP,MULTICAST> mtu 1420
index 9 priority 0 llprio 3
wgport 51820
wgpubkey BvWfmzqI94CkkI5TygWcmT10de8+7DUA2cxsl3jPeyo=
wgpeer klMOiaGJpjMM1bqJouUirOIJRRqcQ8J5QdWOErfj5UM=
       wgpsk (present)
       wgpka 25 (sec)
       wgendpoint 5.135.165.132 51820
       tx: 19396164, rx: 2960456
       last handshake: 57 seconds ago
       wgaip fde4:f456:48c2:13c0::/64
groups: wg
inet6 fde4:f456:48c2:13c0::cc35 prefixlen 64

Per above output last handshake is pretty recent. However, over em(1)
device there is RPi connected with Linux, which also has wireguard
tunnel configured to the same endpoint:

rpi-0058:~# wg
interface: wg0
public key: QLG8RSrYJ/MUmIo2NJcgwleAnPFnl843HwNDgcd9u0c=
private key: (hidden)
listening port: 51820

peer: klMOiaGJpjMM1bqJouUirOIJRRqcQ8J5QdWOErfj5UM=
preshared key: (hidden)
endpoint: 5.135.165.132:51820
allowed ips: fde4:f456:48c2:13c0::/64
latest handshake: 1 day, 29 minutes, 14 seconds ago
transfer: 424.79 KiB received, 3.09 MiB sent
persistent keepalive: every 25 seconds

Per above output, we see that latest handshake was more than a day ago.
When I look tcpdump on em(4) which is connected to RPi I see following:

pce-0035# tcpdump -c5 -ni em1 host 5.135.165.132 and port 51820
tcpdump: listening on em1, link-type EN10MB
23:22:03.803721 192.168.1.58.51820 > 5.135.165.132.51820: [wg] initiation from 0xec39d550 [tos 0x88] 23:22:08.923739 192.168.1.58.51820 > 5.135.165.132.51820: [wg] initiation from 0x13821437 [tos 0x88] 23:22:14.043830 192.168.1.58.51820 > 5.135.165.132.51820: [wg] initiation from 0x3fda1931 [tos 0x88] 23:22:19.803752 192.168.1.58.51820 > 5.135.165.132.51820: [wg] initiation from 0x1537a4da [tos 0x88] 23:22:25.083788 192.168.1.58.51820 > 5.135.165.132.51820: [wg] initiation from 0x93343c5f [tos 0x88]

Then when I look at tcpdmump on umb(4) I see traffic from local machine,
which works fine, but also we see above initiation traffic from RPi,
which uses wrong (old) IP address (TWO wrong IP addresses):

pce-0035# tcpdump -c50 -ni umb0 host 5.135.165.132 and port 51820
tcpdump: listening on umb0, link-type LOOP
23:23:30.355499 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 160 to 0x12b4877c nonce 6 23:23:30.355695 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 160 to 0x12b4877c nonce 7 23:23:30.421959 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 7 23:23:30.614535 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 8 23:23:31.360393 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 480 to 0x12b4877c nonce 8 23:23:31.360398 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 256 to 0x12b4877c nonce 9 23:23:31.431873 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 9 23:23:32.369725 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 480 to 0x12b4877c nonce 10 23:23:32.683202 100.102.59.145.52422 > 5.135.165.132.51820: [wg] initiation from 0x3007810a [tos 0x88] 23:23:32.744484 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 10 23:23:33.379929 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 256 to 0x12b4877c nonce 11 23:23:33.380107 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 12 23:23:33.518203 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 11 23:23:34.389827 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 480 to 0x12b4877c nonce 13 23:23:34.753559 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 12 23:23:35.399748 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 14 23:23:35.774192 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 13 23:23:35.903086 100.100.10.26.55426 > 5.135.165.132.51820: [wg] initiation from 0xa30f0817 [tos 0x88] 23:23:36.409833 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 15 23:23:36.409838 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 256 to 0x12b4877c nonce 16 23:23:36.550602 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 14 23:23:37.419936 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 480 to 0x12b4877c nonce 17 23:23:37.712890 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 15 23:23:37.882966 100.102.59.145.52422 > 5.135.165.132.51820: [wg] initiation from 0x293df129 [tos 0x88] 23:23:38.429823 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 480 to 0x12b4877c nonce 18 23:23:38.834033 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 16 23:23:39.439863 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 19 23:23:39.801909 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 17 23:23:40.449874 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 20 23:23:40.833360 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 18 23:23:41.279245 100.100.10.26.55426 > 5.135.165.132.51820: [wg] initiation from 0xe9cf88d8 [tos 0x88] 23:23:41.459919 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 480 to 0x12b4877c nonce 21 23:23:41.794501 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 19 23:23:42.469957 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 22 23:23:42.833630 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 20 23:23:43.082901 100.102.59.145.52422 > 5.135.165.132.51820: [wg] initiation from 0x41dfd760 [tos 0x88] 23:23:43.479931 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 480 to 0x12b4877c nonce 23 23:23:43.873761 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 21 23:23:44.489891 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 24 23:23:44.833901 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 22 23:23:45.500016 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 25 23:23:45.874031 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 23 23:23:46.398752 100.100.10.26.55426 > 5.135.165.132.51820: [wg] initiation from 0x2aaeec29 [tos 0x88] 23:23:46.510138 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 480 to 0x12b4877c nonce 26 23:23:46.782331 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 24 23:23:47.520027 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 27 23:23:47.794006 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 25 23:23:48.530003 100.125.241.79.51820 > 5.135.165.132.51820: [wg] data length 368 to 0x12b4877c nonce 28 23:23:48.762716 100.102.59.145.52422 > 5.135.165.132.51820: [wg] initiation from 0xa0dadeeb [tos 0x88] 23:23:48.904684 5.135.165.132.51820 > 100.125.241.79.51820: [wg] data length 80 to 0xb8f89c82 nonce 26

Above grep for `initiation from`.

See IP addr 100.125.241.79 (working) versus 100.100.10.26 (not working),
100.102.59.145 (another not working) and states in PF:

pce-0035# pfctl -ss | grep -F 5.135.165.132:51820
all udp 5.135.165.132:51820 <- 192.168.1.58:51820       MULTIPLE:MULTIPLE
all udp 100.102.59.145:52422 (192.168.1.58:51820) -> 5.135.165.132:51820 MULTIPLE:MULTIPLE
all udp 5.135.165.132:51820 <- 192.168.5.189:51820       MULTIPLE:MULTIPLE
all udp 100.100.10.26:55426 (192.168.5.189:51820) -> 5.135.165.132:51820 MULTIPLE:MULTIPLE
all udp 100.125.241.79:51820 -> 5.135.165.132:51820       MULTIPLE:MULTIPLE

Current IP address on umb(4) is 100.125.241.79:

pce-0035# ifconfig umb0
umb0: flags=8855<UP,DEBUG,POINTOPOINT,RUNNING,SIMPLEX,MULTICAST> mtu 1500
index 7 priority 6 llprio 3
roaming disabled registration home network
state up cell-class LTE rssi -101dBm speed 47.7Mbps up 286Mbps down
SIM initialized PIN valid (3 attempts left)
subscriber-id 000000000000000 ICC-id 00000000000000000000 provider PLAY
device MC7455 IMEI 000000000000000 firmware SWI9X30C_02.33.03.00
APN internet
dns 89.108.202.20 185.89.185.1
groups: egress
status: active
inet 100.125.241.79 --> 100.125.241.80 netmask 0xffffffe0

I do periodic ifconfig(8) and send it to syslog, so I see IP address
changed twice during uptime of this machine:

# zgrep -F 'inet 100' /var/log/messages...
...
2021-05-17T23:21:01.528Z pce-0035 ifconfig[80523]: inet 100.102.59.145 --> 100.102.59.146 netmask 0xfffffffc 2021-05-17T23:32:47.049Z pce-0035 ifconfig[43011]: inet 100.102.59.145 --> 100.102.59.146 netmask 0xfffffffc 2021-05-17T23:32:47.049Z pce-0035 ifconfig[43011]: inet 100.100.10.26 --> 100.100.10.25 netmask 0xfffffffc 2021-05-18T00:21:01.195Z pce-0035 ifconfig[96823]: inet 100.100.10.26 --> 100.100.10.25 netmask 0xfffffffc
...
2021-05-18T16:21:01.975Z pce-0035 ifconfig[21612]: inet 100.100.10.26 --> 100.100.10.25 netmask 0xfffffffc 2021-05-18T16:38:15.484Z pce-0035 ifconfig[43011]: inet 100.100.10.26 --> 100.100.10.25 netmask 0xfffffffc 2021-05-18T16:38:15.484Z pce-0035 ifconfig[43011]: inet 100.125.241.79 --> 100.125.241.80 netmask 0xffffffe0 2021-05-18T17:21:01.649Z pce-0035 ifconfig[27235]: inet 100.125.241.79 --> 100.125.241.80 netmask 0xffffffe0
...

How-To-Repeat:
Setup NAT with PF, connect wireguard client over internal
network, which goes over external interface which changes IP address
once in a while, in my case it's umb(4).

Fix:
Unknown. Many workarounds, pfctl -Fs, seems the simplest?

After pfctl -Fs, wireguard tunnel works straightaway:

rpi-0058:~# ping6 -n -c5 fde4:f456:48c2:13c0::1
PING fde4:f456:48c2:13c0::1(fde4:f456:48c2:13c0::1) 56 data bytes
64 bytes from fde4:f456:48c2:13c0::1: icmp_seq=2 ttl=64 time=54.3 ms
64 bytes from fde4:f456:48c2:13c0::1: icmp_seq=3 ttl=64 time=58.7 ms
64 bytes from fde4:f456:48c2:13c0::1: icmp_seq=4 ttl=64 time=60.9 ms
64 bytes from fde4:f456:48c2:13c0::1: icmp_seq=5 ttl=64 time=70.7 ms

--- fde4:f456:48c2:13c0::1 ping statistics ---
5 packets transmitted, 4 received, 20% packet loss, time 15ms
rtt min/avg/max/mdev = 54.346/61.151/70.704/5.993 ms


Latest handshake obviously drops to a recent value:

rpi-0058:~# wg
interface: wg0
public key: QLG8RSrYJ/MUmIo2NJcgwleAnPFnl843HwNDgcd9u0c=
private key: (hidden)
listening port: 51820

peer: klMOiaGJpjMM1bqJouUirOIJRRqcQ8J5QdWOErfj5UM=
preshared key: (hidden)
endpoint: 5.135.165.132:51820
allowed ips: fde4:f456:48c2:13c0::/64
latest handshake: 1 minute, 20 seconds ago
transfer: 429.69 KiB received, 3.12 MiB sent
persistent keepalive: every 25 seconds

--
Regards,
Mikolaj

Reply via email to