We just had a sudden spike in states, roughly 100x the normal number, which hit the configured state-table maximum in about an hour and caused the system to fail.

With a maxed-out state table, of course the system fails to process traffic. Has anyone seen something like this before, or have any ideas about what could cause it?

Monitoring PNG attached.

For us, on a normal day, the system hovers around 7-15K states. Just before noon today, the system suddenly started adding states at a rate of about 9K per minute until it maxed out (at 800K states in just under an hour and fifteen minutes).

Failure mode analysis was difficult because we couldn't access the WebUI or SSH: (of course) the LAN interface couldn't allocate a state for the connection, so we had to restart, hoping to find something in the logs. The logs were not helpful because the circular logs were too small (subsequently "embiggened", of course). More to the point, the offending states wouldn't be logged anyway, so the logs can't tell us which IP or IPs the offending states belong to.

Going forward:

The ~1 hour window in which to do forensics (when/if this happens again) is quite small, so I wonder if there is a way to have Growl generate a notification when, say, the state count exceeds a certain threshold, so we can at least pay attention while it's happening. Any tips on notifications?
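
To make the threshold idea concrete, here is a minimal sketch of the kind of check I have in mind. It assumes Python 3.7+ is available on the box (or that the output is fetched from another host), that it runs periodically from cron, and that `pfctl -si` prints a "State Table" section containing a "current entries" line. The threshold value and the function names are placeholders of mine, and the final `print` stands in for whatever notifier actually gets used.

```python
import re
import shutil
import subprocess

# Hypothetical alert level: well above the normal 7-15K, far below the 800K maximum.
STATE_THRESHOLD = 50_000


def parse_current_states(pfctl_info: str) -> int:
    """Return the state table's 'current entries' count from `pfctl -si` text."""
    # Anchor on the "State Table" heading so a later section's
    # "current entries" line (e.g. source tracking) is not picked up instead.
    match = re.search(r"State Table.*?current entries\s+(\d+)", pfctl_info, re.DOTALL)
    if match is None:
        raise ValueError("no state table 'current entries' line found")
    return int(match.group(1))


def over_threshold(pfctl_info: str, threshold: int = STATE_THRESHOLD) -> bool:
    """True when the current state count is above the alert threshold."""
    return parse_current_states(pfctl_info) > threshold


# Only query pf when run as a script on a machine that actually has pfctl.
if __name__ == "__main__" and shutil.which("pfctl"):
    info = subprocess.run(
        ["pfctl", "-si"], capture_output=True, text=True, check=True
    ).stdout
    if over_threshold(info):
        # Replace this print with whatever notifier is available (mail, Growl, ...).
        print(f"WARNING: {parse_current_states(info)} states, threshold {STATE_THRESHOLD}")
```

At roughly 9K new states per minute, a check every minute with a threshold of a few times the normal load would have fired within the first ten minutes of today's event.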

Probably irrelevant, but this is pfSense 2.4.2-RELEASE-p1 (amd64) on a Supermicro Rangeley/Atom with ECC, on ZFS.

Thanks!
-Karl


_______________________________________________
pfSense mailing list
https://lists.pfsense.org/mailman/listinfo/list
Support the project with Gold! https://pfsense.org/gold