I'm running OpenBSD 6.0 i386 on a Soekris as my local firewall. We had/have a problem with network dropouts on our NBN satellite connection which I believe I've traced to the firewall's ARP entry for the upstream gateway expiring.

The problem appears to be that once the ARP entry expires, the firewall does not issue an ARP who-has request to renew the entry. As a consequence packets can't be forwarded to the gateway and it looks like an ISP outage. This state persists for periods of up to 10 minutes.

During the "outage" DHCP on the ISP link works (presumably because that doesn't involve the ARP table) but pings to the gateway do not, and neither does any other normal IP traffic which requires using the gateway.

I left a tcpdump running for the gateway host IP and noticed this morning that as soon as an ARP request went out and was answered (immediately), traffic started working again, which led me to pursue this.
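
For anyone wanting to watch for the same thing, a capture along these lines should show the ARP exchange. This is a sketch, not the exact command I ran: the interface name vr0 is a stand-in for whatever the ISP-facing interface is called, and the filter is narrowed to ARP only. 172.16.20.254 is our gateway. It needs root.

```sh
# Hypothetical name for the ISP-facing interface; substitute your own.
IFACE=vr0
# The upstream gateway address.
GW=172.16.20.254

# Print ARP packets to or from the gateway, with link-level headers (-e)
# and without name resolution (-n).
watch_gw_arp() {
    tcpdump -n -e -i "$IFACE" arp and host "$GW"
}
```

Running watch_gw_arp in one terminal while pinging the gateway from another should make it obvious whether a who-has request ever goes out.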

I don't understand why, since the gateway address doesn't have a current ARP entry, the firewall does not immediately issue an ARP request for it. Even a ping directly from the firewall to the gateway address does not cause an ARP request.

In case it is relevant, all the through traffic is directed via PF nat-to rules, but I suspect this isn't related because direct ping traffic from the firewall also doesn't work. On the other hand, there's a secondary interface to a 3G modem which doesn't do this, and traffic through that interface is not NATed because the 3G modem does it.

Finally, I've done the following to verify the issue:

Waited for the ARP entry to expire, and saw throughput cease and direct pings of the gateway from the firewall fail:

 ping 172.16.20.254
 PING 172.16.20.254 (172.16.20.254): 56 data bytes
 ping: sendto: Host is down
 ping: wrote 172.16.20.254 64 chars, ret=-1
 ping: sendto: Host is down
 ping: wrote 172.16.20.254 64 chars, ret=-1

I added the ARP entry by hand with the arp command and throughput and pings resumed immediately.

I've manually removed the ARP entry and seen identical symptoms, and I've manually added a static ARP entry for the gateway and the connection has been solid for several hours now, versus "outages" every hour, if not more frequently.
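
The manual steps above amount to roughly the following. This is a sketch rather than a transcript of my session: the MAC address is a placeholder from the documentation range, not the gateway's real one, and the function names are only there for illustration. The arp commands need root.

```sh
# The upstream gateway address, as in the ping output above.
GW=172.16.20.254
# Placeholder MAC address; substitute the gateway's real one.
GW_MAC=00:00:5e:00:53:01

# Show the current ARP entry for the gateway, without name lookups.
show_gw_arp() {
    arp -n "$GW"
}

# Delete the entry by hand to reproduce the "outage".
drop_gw_arp() {
    arp -d "$GW"
}

# Add a static entry, which does not time out. This is the workaround
# currently in place.
pin_gw_arp() {
    arp -s "$GW" "$GW_MAC"
}
```

Calling drop_gw_arp and then pinging the gateway reproduces the "Host is down" errors; pin_gw_arp restores traffic.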

I would like to understand this behaviour and to know if it is, as it appears, a bug.

Cheers,
Cameron Simpson <c...@cskk.id.au>
