Re: How to analyse excessive PF states?
On Sat, 22 Oct 2016 18:12:37 +0200, Federico Giannici wrote:
> We have a firewall with OpenBSD 6.0 amd64 that handles about 1.5 Gbps
> of traffic.
>
> I noticed that from a few weeks the number of states is increased
> from around 250.000 to almost 2 millions (no change in PF config)!
>
> At the same time the firewall started loosing a few packets (around
> 1-2%, with peeks of 4%). Maybe this is due to too many states to
> handle?

It is hard to tell whether the number of states is the cause, but you do have some PF congestion, which is bad.

Did you try increasing the sysctl net.inet.ip.ifq.maxlen? In my previous setup that helped a bit against congestion (net.inet.ip.ifq.maxlen=2048).

Regards,
Re: How to analyse excessive PF states?
On 2016-10-22, Federico Giannici wrote:
> We have a firewall with OpenBSD 6.0 amd64 that handles about 1.5 Gbps of
> traffic.
>
> I noticed that from a few weeks the number of states is increased from
> around 250.000 to almost 2 millions (no change in PF config)!
>
> At the same time the firewall started loosing a few packets (around
> 1-2%, with peeks of 4%). Maybe this is due to too many states to handle?
>
> How can we find what's happening and creates all these states?
> How can we analyse almost 2 millions states to find the culprit?
>
> Here it is the current output of "pfctl -s info":

I think I would start by monitoring "tcpdump -nipfsync0 -s9000" (maybe writing to a file and reading on another machine).

My first guess would be some udp ddos-related traffic (dns, snmp, sip, ntp) or possibly synflood. Depending on what it is, reducing state timeouts on that traffic might be reasonable.
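
One way to check that guess against the state table itself is to tally a saved state dump by protocol, destination port and state: a pile of udp states towards port 53, 123, 161 or 5060, or of half-open tcp states, would point in that direction. The following is only a rough sketch. It assumes the dump comes from "pfctl -ss" and that each state line looks like "IFACE PROTO ADDR:PORT -> ADDR:PORT STATE" (with "<-" for inbound states, where the left-hand side is the destination). The names parse_state and tally and the sample addresses are made up for illustration, and the layout should be checked against real output before trusting the numbers.

```python
import re
from collections import Counter

ARROWS = {"->", "<-"}


def split_endpoint(token):
    """Split 'addr:port' (IPv4) or 'addr[port]' (IPv6) into (addr, port)."""
    m = re.fullmatch(r"(.+)\[(\d+)\]", token)
    if m:
        return m.group(1), m.group(2)
    if token.count(":") == 1:
        addr, port = token.split(":")
        return addr, port
    return token, None  # no recognisable port (e.g. bare IPv6 address)


def parse_state(line):
    """Return (proto, src_addr, dst_port, state) or None for non-state lines."""
    fields = line.split()
    arrows = [i for i, f in enumerate(fields) if f in ARROWS]
    if len(fields) < 6 or not arrows or arrows[0] + 1 >= len(fields):
        return None
    i = arrows[0]
    left, right = fields[2], fields[i + 1]
    # Assumption: "->" means left is the source, "<-" means right is the source.
    src, dst = (left, right) if fields[i] == "->" else (right, left)
    src_addr, _ = split_endpoint(src)
    _, dst_port = split_endpoint(dst)
    return fields[1], src_addr, dst_port, fields[-1]


def tally(lines):
    """Count states per (proto, dst_port, state) and per source address."""
    by_service, by_source = Counter(), Counter()
    for line in lines:
        rec = parse_state(line)
        if rec is None:
            continue  # header, blank or verbose continuation line
        proto, src_addr, dst_port, state = rec
        by_service[(proto, dst_port, state)] += 1
        by_source[src_addr] += 1
    return by_service, by_source


# Hypothetical sample lines using documentation address ranges.
SAMPLE = """\
all udp 192.0.2.1:40000 -> 203.0.113.9:53       MULTIPLE:SINGLE
all udp 192.0.2.1:40001 -> 203.0.113.9:53       SINGLE:NO_TRAFFIC
all tcp 10.0.0.5:443 <- 198.51.100.7:51514       ESTABLISHED:ESTABLISHED
"""

by_service, by_source = tally(SAMPLE.splitlines())
for key, count in by_service.most_common(10):
    print(count, *key)
for addr, count in by_source.most_common(10):
    print(count, addr)
```

On a real firewall the dump would be written to a file first (with almost 2 million states it is better analysed on another machine) and the file's lines passed to tally() instead of SAMPLE.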
Re: How to analyse excessive PF states?
On 10/22/16 18:12, Federico Giannici wrote:
> We have a firewall with OpenBSD 6.0 amd64 that handles about 1.5 Gbps of
> traffic.
>
> I noticed that from a few weeks the number of states is increased from
> around 250.000 to almost 2 millions (no change in PF config)!
>
> At the same time the firewall started loosing a few packets (around
> 1-2%, with peeks of 4%). Maybe this is due to too many states to handle?
>
> How can we find what's happening and creates all these states?
> How can we analyse almost 2 millions states to find the culprit?

The exact answers depend a great deal on what monitoring you have in place already. At the very least, studying the output of pfctl -ss (massaged via some scriptery if needed) will give some clues. Better still if you have something that keeps track of connections and states over time (netflow export via pflow comes to mind).

The packet loss could conceivably be a side effect of the number of states going into the territory where timeouts are scaled down (exceeding 60% of the state table limit, IIRC).

- P
-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
http://bsdly.blogspot.com/ http://www.bsdly.net/ http://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.
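
To put rough numbers on the timeout scaling mentioned above, here is a small sketch of the arithmetic. It assumes the defaults as described in pf.conf(5): adaptive.start at 60% of the state limit, adaptive.end at 120%, and timeouts scaled linearly by (adaptive.end - states) / (adaptive.end - adaptive.start) in between. The function names and the example figures are illustrative only, and the defaults should be verified against the man page for the release actually running.

```python
def adaptive_factor(states, limit, start=None, end=None):
    """Scaling factor applied to timeouts for a given number of states.

    start/end default to 60% and 120% of the state limit, which is an
    assumption taken from pf.conf(5); pass explicit values if the ruleset
    sets adaptive.start / adaptive.end itself.
    """
    if start is None:
        start = limit * 6 // 10
    if end is None:
        end = limit * 12 // 10
    if states <= start:
        return 1.0  # below adaptive.start: timeouts are not scaled
    if states >= end:
        return 0.0  # at adaptive.end: timeouts drop to zero
    return (end - states) / (end - start)


def scaled_timeout(timeout, states, limit):
    """Effective timeout in seconds after adaptive scaling."""
    return int(timeout * adaptive_factor(states, limit))


# Hypothetical example: a state limit of 3,000,000 and the state counts
# mentioned in the thread. 86400 s is used as an example timeout value.
for states in (250_000, 2_000_000):
    print(states, adaptive_factor(states, 3_000_000),
          scaled_timeout(86400, states, 3_000_000))
```

The actual state limit is shown by "pfctl -sm"; comparing it with the current number of states from "pfctl -s info" tells whether the table is in the scaled range at all.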