On 1/22/2016 3:35 PM, Nick Rogers wrote:
On Thu, Jan 21, 2016 at 11:44 AM, Matthew Grooms <mgro...@shrew.net> wrote:
# pfctl -si
Status: Enabled for 0 days 02:25:41           Debug: Urgent

State Table                          Total             Rate
  current entries                    77759
  searches                       483831701        55352.0/s
  inserts                           825821           94.5/s
  removals                          748060           85.6/s
Counters
  match                           27118754         3102.5/s
  bad-offset                             0            0.0/s
  fragment                               0            0.0/s
  short                                  0            0.0/s
  normalize                              0            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                           6655            0.8/s
  proto-cksum                            0            0.0/s
  state-mismatch                         0            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s
# pfctl -st
tcp.first                   120s
tcp.opening                  30s
tcp.established           86400s
tcp.closing                 900s
tcp.finwait                  45s
tcp.closed                   90s
tcp.tsdiff                   30s
udp.first                   600s
udp.single                  600s
udp.multiple                900s
icmp.first                   20s
icmp.error                   10s
other.first                  60s
other.single                 30s
other.multiple               60s
frag                         30s
interval                     10s
adaptive.start            90000 states
adaptive.end             120000 states
src.track                     0s
I think there may be a problem with the code that calculates the adaptive
timeout values that is making it way too aggressive. If by default it's
supposed to decrease linearly between 60% and 120% of the state table max,
I shouldn't be losing TCP connections that are idle for only a few minutes
when the state table is < 70% full. Unfortunately, that appears to be the
case. At most, this should have decreased the 86400s timeout by 17%, to
72000s, for established TCP connections.
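For reference, pf.conf(5) describes adaptive scaling as a linear decrease of all timeout values once the state count exceeds adaptive.start, reaching zero at adaptive.end. A quick sketch of that arithmetic with the settings from the pfctl output above (the 95000-entry figure is hypothetical, just to show a mid-range value):

```python
def adaptive_timeout(base, states, start, end):
    """Scaled timeout in seconds, per the linear rule in pf.conf(5)."""
    if states <= start:
        return base                 # below adaptive.start: no scaling
    if states >= end:
        return 0                    # at/above adaptive.end: immediate expiry
    return base * (end - states) / (end - start)

# 77759 entries (from pfctl -si) is below adaptive.start, so no scaling:
print(adaptive_timeout(86400, 77759, 90000, 120000))   # 86400
# Hypothetical 95000 entries: scaled by 25000/30000, about a 17% cut:
print(adaptive_timeout(86400, 95000, 90000, 120000))   # 72000.0
```

If pf really follows this rule, a state table below adaptive.start should leave tcp.established untouched, which is why the observed behavior looks like a bug.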
That doesn't make sense to me either. Even if the math were off by a factor
of 10, the state should live for about 24 minutes.
I've tested this for a few hours now and all my idle SSH sessions have
been rock solid. If anyone else is scratching their head over a problem
like this, I would suggest disabling the adaptive timeout feature or
increasing it to a much higher value. Maybe one of the pf maintainers can
chime in and shed some light on why this is happening. If not, I'm going to
file a bug report as this certainly feels like one.
Did you go with making the adaptive timeout less aggressive, or did you
disable it entirely? I would think that if the adaptive timeout were really
that broken, more people would notice this problem, especially me, since I
have many servers running a very short tcp.established timeout; but the fact
that you are noticing this kind of weirdness has me concerned about how the
adaptive setting is affecting my environment.
I increased the value to 90K for the 10K limit. Yes, it's concerning.
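For anyone else wanting to try the same mitigation, the relevant knobs live in pf.conf's timeout settings. These values mirror the pfctl output above and are illustrative, not a recommendation:

```
# raise the thresholds where adaptive scaling kicks in (illustrative)
set timeout { adaptive.start 90000, adaptive.end 120000 }

# or, per pf.conf(5), disable adaptive scaling entirely:
# set timeout { adaptive.start 0, adaptive.end 0 }
```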
Today I set up a test environment at about 1/10th the connection count to
see if I could reproduce the issue on a smaller scale, but had no luck. I'm
trying to find a command-line test program that will generate enough TCP
connections to reproduce it at a scale similar to my production environment.
So far I haven't found anything that does the trick, so I may end up rolling
my own. I'll reply back to the list if I find a way to reproduce this.
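Rolling your own connection generator can be fairly small. A minimal sketch that holds many idle TCP connections open so they accumulate in the pf state table (the host, port, and count are placeholders; you will also likely need to raise the file-descriptor limit, e.g. with ulimit -n):

```python
import socket

def open_idle_connections(host, port, count):
    """Open `count` TCP connections to host:port and hold them open."""
    socks = []
    for _ in range(count):
        s = socket.create_connection((host, port), timeout=5)
        socks.append(s)            # keep a reference so the socket stays open
    return socks

# Example usage (placeholder target in TEST-NET-1 space):
# conns = open_idle_connections("192.0.2.10", 80, 1000)
# import time; time.sleep(3600)   # leave them idle so the pf timeouts apply
```

Running several of these in parallel from a couple of client boxes should get the state count into the range where adaptive scaling engages.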
Thanks again,
-Matthew
_______________________________________________
freebsd-net@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"