Hi

I posted this a week ago to "misc" with no success, so now I am posting to "pf" instead, with some additional info...
#Setup:# A redundant firewall pair (two HP DL380 G4, ciss mirror) with three dual-port gigabit em NICs (plus 2 unused bge), 6 vlans, pfsync and a 1500-line pf.conf. OpenBSD 3.8-stable (updated three weeks ago). We run the GENERIC kernel plus a backported SACK patch so that "synproxy" works correctly.

#Problem:# This redundant firewall pair just died after a couple of weeks of good work. All interfaces use carp. During the last 24 hours before the problem they had a load a constant 25-30% above average: 100-110 Mbit/s outgoing and 80-90 Mbit/s incoming. A pfstat graph shows a packet rate of no more than 15000 packets/s in either direction.

Apr 11 09:32:16 XXXXXX /bsd: WARNING: mclpool limit reached; increase kern.maxclusters

On the list we have seen people raise kern.maxclusters to over 65000 without success (the firewall just lasts longer), who later found out that they had a driver bug (xl, for example). Unfortunately I don't have "netstat -m" or "vmstat -m | grep mcl" output from before the crash, but I assume I would not have been happy with what it showed. We have now raised the value to 65000 and have not seen the problem again. But we don't know whether it will come back when the load goes up. We have also added some vmstat and netstat output below my signature, collected *after* this value was raised to 65000. It seems like the peak value reported by "netstat -m" just keeps growing. Could this be a bug?

#Question:# This problem is *hopefully* caused by high network load and therefore only needs tuning, rather than being an OS problem. "sysctl -a | grep kern.maxclusters" showed the default, kern.maxclusters=6144 (this was before we raised it to 65000). What is a reasonable value for kern.maxclusters in a situation like this? (We ask because we don't want to raise it too high, as we are also afraid of possible side effects.)

Additional info: when the servers died, the load peak described above was already over (see the link below).
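On the sizing question, a back-of-the-envelope sketch may help: each mbuf cluster is 2048 bytes (this matches the mclpl "Size" column in the vmstat output further down), so the worst-case memory a given kern.maxclusters setting can pin is simple arithmetic. The exact per-cluster kernel overhead is an assumption left out here.

```shell
# Worst-case kernel memory for a given kern.maxclusters, assuming
# 2048-byte clusters (the mclpl pool size) and ignoring pool overhead.
for maxclusters in 6144 65536; do
    echo "$maxclusters clusters -> $(( maxclusters * 2048 / 1024 / 1024 )) MB"
done
# 6144 (the default) works out to 12 MB; 65536 to 128 MB.
```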
Any good reason why they died when the load was back at the standard level? See http://www.flowsystems.se/~sjoholmp/pfstat.jpg

Thanks in advance
Per-Olov Sjöholm
--
GPG keyID: 4DB283CE
GPG fingerprint: 45E8 3D0E DE05 B714 D549 45BC CFB4 BBE9 4DB2 83CE

netstat -m:
"
<snipp>
Thu Apr 13 16:35:01 CEST 2006
1839 mbufs in use:
        1771 mbufs allocated to data
        62 mbufs allocated to packet headers
        6 mbufs allocated to socket names and addresses
1769/1800/65536 mbuf clusters in use (current/peak/max)
4072 Kbytes allocated to network (98% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Thu Apr 13 16:40:01 CEST 2006
2495 mbufs in use:
        2360 mbufs allocated to data
        132 mbufs allocated to packet headers
        3 mbufs allocated to socket names and addresses
2359/10866/65536 mbuf clusters in use (current/peak/max)
23540 Kbytes allocated to network (22% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
<snipp>
Sun Apr 16 19:45:01 CEST 2006
1349 mbufs in use:
        1289 mbufs allocated to data
        57 mbufs allocated to packet headers
        3 mbufs allocated to socket names and addresses
1288/10866/65536 mbuf clusters in use (current/peak/max)
22104 Kbytes allocated to network (13% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Sun Apr 16 19:50:01 CEST 2006
24199 mbufs in use:
        24072 mbufs allocated to data
        123 mbufs allocated to packet headers
        4 mbufs allocated to socket names and addresses
24071/26176/65536 mbuf clusters in use (current/peak/max)
58688 Kbytes allocated to network (20% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Sun Apr 16 19:55:01 CEST 2006
1397 mbufs in use:
        1300 mbufs allocated to data
        93 mbufs allocated to packet headers
        4 mbufs allocated to socket names and addresses
1302/28222/65536 mbuf clusters in use (current/peak/max)
56936 Kbytes allocated to network (5% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
<snipp>
"

vmstat:
"
Thu Apr 13 13:05:01 CEST 2006
Name   Size  Requests   Fail  Releases   Pgreq  Pgrel  Npage  Hiwat  Minpg  Maxpg  Idle
mclpl  2048  169393794  0     169392503  807    0      807    807    4      32768  159
<snipp>
Tue Apr 18 09:55:01 CEST 2006
Name   Size  Requests    Fail  Releases    Pgreq  Pgrel  Npage  Hiwat  Minpg  Maxpg  Idle
mclpl  2048  2757353733  0     2757352434  14111  0     14111  14111  4      32768  13459
"
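P.S. The samples above were apparently collected every five minutes with "netstat -m" and "vmstat -m". To line the cluster numbers up against the pfstat graphs, a small awk helper (just a sketch, not part of any standard tool) can pull the current/peak/max columns out of such a log:

```shell
# Sketch: extract current/peak/max mbuf clusters from "netstat -m"
# output (or a log of it). The first field looks like "1302/28222/65536".
parse_clusters() {
    awk '/mbuf clusters in use/ { split($1, f, "/"); print f[1], f[2], f[3] }'
}

# Feeding it one line from the Sun Apr 16 19:55 sample above:
echo '1302/28222/65536 mbuf clusters in use (current/peak/max)' | parse_clusters
# prints: 1302 28222 65536
```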