> On 13 Jan 2026, at 11:33, Stuart Henderson <stu.lists spacehopper ! org> 
> wrote:
> 
> Probably not the correct solution, this smells like a bug (mbuf leak).
> Bumping the max limit would keep it running for longer when it hits the
> problem but depending on how fast the leak is, it might not be much
> longer.

Thanks, that is useful. I was a bit worried that modifying the sysctl
would only mask the problem (if anything).

> Also: what interface types do you have on the system? (ifconfig |
> grep ^[a-z] - I know you included dmesg but that doesn't show anything
> like wg, gif, etc if you're using those). Do you use ipsec?

I have 4 physical interfaces: one is connected via PPPoE to the WAN,
and the other 3 are the LAN, bridged together with veb. I have a
WireGuard interface as well, but no ipsec.

# ifconfig | grep ^[a-z]
lo0: flags=2008049<UP,LOOPBACK,RUNNING,MULTICAST,LRO> mtu 32768
igc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1508
igc1: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
aq0: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
aq1: flags=8b43<UP,BROADCAST,RUNNING,PROMISC,ALLMULTI,SIMPLEX,MULTICAST> mtu 1500
enc0: flags=0<>
pppoe0: flags=8851<UP,POINTOPOINT,RUNNING,SIMPLEX,MULTICAST> mtu 1500
veb0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST>
vport0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
wg0: flags=80c3<UP,BROADCAST,RUNNING,NOARP,MULTICAST> mtu 1420
pflog0: flags=141<UP,RUNNING,PROMISC> mtu 33136

> If you manage to catch it while the connection is down but before the
> machine has locked up, does it help to do 'ifconfig igc0 down; ifconfig
> igc0 up'?

I have tried to do this, but most of the failures happen overnight.
The few times it has happened whilst I have been nearby, the machine
had locked up within the couple of minutes that it takes to detect the
issue and connect to it. I have thought about using ifstated(8) to do
this automatically, but as with the sysctl my worry is that it would
merely mask the actual problem.

> I'd monitor this over time and see if the rise is sudden or gradual.
> And whether you can correlate with log entries. Is it happening after
> the LCP timeout? Is the LCP timeout happening after this has already
> risen?
> 
> e.g.
> 
> $ while true; do netstat -m | grep mbuf.2048; sleep 5; done | ts
> 
> Capture of the output of 'systat mbuf' might also give a clue (it updates
> frequently, you could leave it running in ssh).

I have set up some capture around this and I will report back once
it fails again and hopefully that will give some additional hints.
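
For anyone wanting to reproduce the capture: the loop suggested above
pipes into ts, which comes from the moreutils package rather than the
base system. A minimal sketch that avoids that dependency, assuming a
POSIX sh, is below. The function names (stamp, filter_clusters), the
script name, the log path, and the INTERVAL and LOG variables are
placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch only: names, log path and interval are illustrative assumptions.

# Prefix every line read from stdin with a UTC timestamp (stand-in for ts).
stamp() {
    while IFS= read -r line; do
        printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$line"
    done
}

# Keep only the 2048-byte cluster line, using the same pattern as the
# suggested one-liner.
filter_clusters() {
    grep 'mbuf.2048'
}

# Sample in a loop only when invoked as: sh mbufwatch.sh run
if [ "${1:-}" = "run" ]; then
    while true; do
        netstat -m | filter_clusters | stamp
        sleep "${INTERVAL:-5}"
    done >> "${LOG:-/tmp/mbuf-watch.log}"
fi
```

Started with 'sh mbufwatch.sh run &', this appends one timestamped
line per sample, so the log can be lined up afterwards against the LCP
timeout entries in the system log.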
