[pfSense] System stats: HUGE SPIKE, then failed.

Karl Fife Tue, 03 Apr 2018 14:51:07 -0700

There was just now a sudden spike in states, ~100x the normal number, maxing out the state table in about an hour and causing the system to fail.

With a maxed-out state table, of course the system fails to process traffic. Has anyone seen something like this before, or have any idea what kinds of things could cause this?


Monitoring PNG attached.

For us, on a normal day, the system hovers around 7-15K states. Just before noon today, the system suddenly started adding states at a rate of about 9K per minute until the system maxed out (at 800K states in just under an hour and fifteen minutes).

Failure mode analysis was difficult because we couldn't access the WebUI or SSH because (of course) the LAN interface couldn't allocate a state for the connection, so we had to restart, hoping to find something in the logs. The logs were not helpful: the circular logs were too small (subsequently "embiggened", of course), but more to the point, the offending states wouldn't be logged anyway, so the logs wouldn't tell us which IP or IPs the offending states belonged to.
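Since the logs can't identify the offending hosts, one option during an incident is to tally the live state table by source address. The following is only a rough sketch, not a tested pfSense recipe: it assumes Python 3 is available, that the text of `pfctl -ss` has been captured into a string, and that each state line has the form `IFACE PROTO ADDR:PORT -> ADDR:PORT STATE` (with `<-` for inbound states and an optional parenthesized pre-NAT address). The function names are hypothetical.

```python
from collections import Counter


def host_of(endpoint):
    """Strip the port (and any NAT parentheses) from a pf endpoint token."""
    endpoint = endpoint.strip("()")
    if "[" in endpoint:              # assumed IPv6 form: addr[port]
        return endpoint.split("[", 1)[0]
    if endpoint.count(":") == 1:     # assumed IPv4 form: addr:port
        return endpoint.rsplit(":", 1)[0]
    return endpoint                  # no port present


def source_of(line):
    """Return the source host of one state line, or None if it can't be parsed."""
    tokens = line.split()
    for arrow in ("->", "<-"):
        if arrow in tokens:
            i = tokens.index(arrow)
            if i == 0 or i + 1 >= len(tokens):
                return None
            # Assumption: the arrow points from source to destination.
            src = tokens[i - 1] if arrow == "->" else tokens[i + 1]
            return host_of(src)
    return None


def top_sources(pfctl_states, n=10):
    """Count states per source host and return the n busiest as (host, count)."""
    counts = Counter()
    for line in pfctl_states.splitlines():
        src = source_of(line)
        if src:
            counts[src] += 1
    return counts.most_common(n)
```

Run against a saved copy of the state table, a single host holding most of the states should stand out at the top of the list. The output format varies between pf versions, so the parsing should be checked against real output before relying on it.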


Going forward:

The ~1 hour window in which to do forensics (when/if this happens again) is quite small, so I wonder if there is a way to have Growl generate a notification when, say, states exceed a certain threshold, so we can at least pay attention while it's happening. Any tips on notifications?
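As a rough sketch of such a threshold check (not a built-in pfSense feature, and untested on 2.4.2): the script below assumes Python 3.7+ is available on the firewall and that `pfctl -si` prints a "State Table" section containing a "current entries" line. The threshold value and function names are hypothetical, and the actual notification (Growl, email, etc.) is left as a simple print.

```python
import re
import subprocess

STATE_THRESHOLD = 50_000  # hypothetical alert level, well above the normal 7-15K


def parse_current_states(pfctl_info):
    """Return 'current entries' from the State Table section of `pfctl -si` text."""
    in_state_table = False
    for line in pfctl_info.splitlines():
        if line.startswith("State Table"):
            in_state_table = True
            continue
        if in_state_table:
            match = re.match(r"\s+current entries\s+(\d+)", line)
            if match:
                return int(match.group(1))
            if line and not line[0].isspace():
                in_state_table = False  # a new, unindented section began
    raise ValueError("no 'current entries' line found in State Table section")


def exceeds_threshold(count, threshold=STATE_THRESHOLD):
    """True when the state count is above the alert threshold."""
    return count > threshold


def check_once():
    """Run pfctl once and print an alert line if the threshold is exceeded."""
    result = subprocess.run(
        ["pfctl", "-si"], capture_output=True, text=True, check=True
    )
    count = parse_current_states(result.stdout)
    if exceeds_threshold(count):
        # Placeholder: replace with whatever notification mechanism is in use.
        print(f"ALERT: {count} states exceeds threshold {STATE_THRESHOLD}")
    return count
```

Calling `check_once()` from a cron job every minute would leave most of the hour-long window for investigation, since a growth rate of about 9K states per minute would cross a 50K threshold within a few minutes.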

Probably irrelevant, but this is: pfSense 2.4.2R p1 AMD64 on a Supermicro Rangely/Atom ECC, ZFS


Thanks!
-Karl

