I think you're right, Stuart: raising kern.maxclusters is only buying me time.

The only sysctl values I've modified are:

net.inet.ip.forwarding=1
ddb.panic=0
kern.maxclusters=8192

netstat -m shows increasing values over time, here's the output from
this morning:

3510 mbufs in use:
        3479 mbufs allocated to data
        24 mbufs allocated to packet headers
        7 mbufs allocated to socket names and addresses
3477/3522/8192 mbuf 2048 byte clusters in use (current/peak/max)
0/8/8192 mbuf 4096 byte clusters in use (current/peak/max)
0/8/8192 mbuf 8192 byte clusters in use (current/peak/max)
0/8/8192 mbuf 9216 byte clusters in use (current/peak/max)
0/8/8192 mbuf 12288 byte clusters in use (current/peak/max)
0/8/8192 mbuf 16384 byte clusters in use (current/peak/max)
0/8/8192 mbuf 65536 byte clusters in use (current/peak/max)
8204 Kbytes allocated to network (95% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

...and here it is from this evening:

3718 mbufs in use:
        3687 mbufs allocated to data
        24 mbufs allocated to packet headers
        7 mbufs allocated to socket names and addresses
3685/3734/8192 mbuf 2048 byte clusters in use (current/peak/max)
0/8/8192 mbuf 4096 byte clusters in use (current/peak/max)
0/8/8192 mbuf 8192 byte clusters in use (current/peak/max)
0/8/8192 mbuf 9216 byte clusters in use (current/peak/max)
0/8/8192 mbuf 12288 byte clusters in use (current/peak/max)
0/8/8192 mbuf 16384 byte clusters in use (current/peak/max)
0/8/8192 mbuf 65536 byte clusters in use (current/peak/max)
8628 Kbytes allocated to network (96% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines

Here's the output from systat mbuf:

   1 users    Load 0.65 0.79 0.76                 Wed Dec 7 18:15:12 2011

IFACE             LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
System                    0   256  3716         242
                               2k  3686        1867
lo0
em0                            2k    21     4   256    21
em1                            2k    20     4   256    20
em2                            2k    14     4   256    14
enc0
vether0
tun0
bridge0
pflog0

I did update the kernel at the same time as changing the bios settings,
so that led me down the wrong path I think.
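
As a rough way to put numbers on that growth, here is a minimal Python
sketch that pulls the 2048 byte cluster counts out of the two netstat -m
samples above and compares them. The function names (parse_clusters,
growth) are made up for illustration, and it assumes the
"current/peak/max" line format shown in the output above.

```python
import re

# Matches lines like:
#   3477/3522/8192 mbuf 2048 byte clusters in use (current/peak/max)
CLUSTER_RE = re.compile(r"(\d+)/(\d+)/(\d+) mbuf (\d+) byte clusters in use")


def parse_clusters(netstat_output):
    """Return {cluster_size: (current, peak, limit)} parsed from `netstat -m` text."""
    usage = {}
    for match in CLUSTER_RE.finditer(netstat_output):
        current, peak, limit, size = (int(group) for group in match.groups())
        usage[size] = (current, peak, limit)
    return usage


def growth(before, after, size=2048):
    """Return (increase in clusters in use, clusters left before the limit)."""
    before_current = parse_clusters(before)[size][0]
    after_current, _, limit = parse_clusters(after)[size]
    return after_current - before_current, limit - after_current


# The 2048 byte cluster lines from the morning and evening samples.
morning = "3477/3522/8192 mbuf 2048 byte clusters in use (current/peak/max)"
evening = "3685/3734/8192 mbuf 2048 byte clusters in use (current/peak/max)"

increase, headroom = growth(morning, evening)
print(f"{increase} more clusters in use, {headroom} left before the limit")
```

With the two samples above that works out to 208 more clusters in use by
the evening, with 4507 left before the 8192 limit.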
Digging through /var/log/messages* it looks as though things changed
when I upgraded from the October 6th snapshot to the November 15th
snapshot. When I was running this (and previous snapshots):

OpenBSD 5.0-current (GENERIC.MP) #96: Thu Oct 6 16:12:43 MDT 2011
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

...I had a bunch of these errors (but no network lockups):

pf: state key linking mismatch! dir=OUT, if=em1, stored af=2, a0: 76.126.243.211:25619, a1: 192.168.10.2:49200, proto=17, found af=2, a0: 176.15.107.37:45022, a1: 239.190.175.222:61374, proto=17

After updating to this (and another update since):

OpenBSD 5.0-current (GENERIC.MP) #133: Tue Nov 15 22:08:20 MST 2011
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

...I now have these warnings (and the network lockups):

WARNING: mclpools limit reached; increase kern.maxclusters

-Nick

On Tue, Dec 6, 2011 at 11:21 AM, Stuart Henderson <s...@spacehopper.org> wrote:
> Have you adjusted any other sysctl values?
>
> What does netstat -m say? Run it once, then again after 30 mins or so.
>
> What does systat mbuf say?
>
> Did you update the kernel at the same time as changing bios settings?
> If so, what did you run before? (check /var/log/messages*)
>
> I doubt there's a legitimate reason to increase kern.maxclusters to
> 8192 on this system, best I think you can hope for with that is to make
> it run for a little longer before crashing.
>
> On 2011-12-06, Nick Templeton <n...@nicktempleton.com> wrote:
>> You're right that I had an outdated BIOS, which I've now updated, but
>> upon further review I don't think that is/was the culprit. I've since
>> had the issue re-surface and this time I noticed many lines like this
>> in the dmesg (not sure how I missed it before):
>>
>> WARNING: mclpools limit reached; increase kern.maxclusters
>>
>> So I've upped kern.maxclusters to 8192, however, I'm not sure if I
>> really should need to.
>> This machine is a firewall/router for my home
>> network running a few services (sshd, named, httpd, tomcat) for about
>> 5 users. There's also a machine that is running Transmission
>> BitTorrent client behind the firewall, maybe that could be the
>> culprit?
>>
>> -Nick
>>
>> On Fri, Dec 2, 2011 at 9:29 AM, Erling Westenvik <erling.westen...@gmail.com> wrote:
>>> You should try upgrading BIOS. As far as I can tell, it would be version
>>> 2.4 as of 8/7/2007.
>>>
>>> http://www.dell.com/support/drivers/us/en/19/DriverDetails/DriverFileFormats?DriverId=HY9F0&FileId=2731098639
>>>
>>> (I was recently given a Dell Optiplex 755, also Intel Core 2 Duo, and I
>>> installed OpenBSD 5.0 on it. However, I got all kinds of errors - mainly
>>> about memory conflict - and the ATi radeon 2400 wouldn't work properly.
>>> Then I realized the BIOS was sixteen versions old (A04) and upgraded it
>>> to the latest (A20) which seemed to fix just about everything..)
>>>
>>> On Fri, Dec 02, 2011 at 08:44:43AM -0600, Nick Templeton wrote:
>>>> I have a Dell XPS210 that, after a few days of uptime, stops
>>>> responding on the network - no ping, ssh, httpd, or tomcat responses -
>>>> I simply get connection resets. I run snapshots on this computer that
>>>> I update approximately monthly. This machine had been working well for
>>>> many months then I decided to tweak some BIOS settings, particularly I
>>>> turned on SpeedStep so I could use apmd(8) in "cool running mode
>>>> (-C)," I made some other tweaks in the BIOS at the time that I can't
>>>> exactly recall, but seemed inconsequential - things like what to do
>>>> after a power outage, boot order, etc. After making these changes in
>>>> the BIOS is when this issue arose. I've since tried putting the BIOS
>>>> settings back the way (I thought) they were, but it hasn't made a
>>>> difference, so I don't know if that was really the issue.
>>>> I'm not quite sure what to grab for diagnostic info, but there's a
>>>> few odd lines I've noticed in the dmesg:
>>>>
>>>> ...
>>>> RTC BIOS diagnostic error 11<memory_size>
>>>> ...
>>>> ioapic0 at mainbus0: apid 8 pa 0xfec00000, version 20, 24 pins
>>>> ioapic0: misconfigured as apic 0, remapped to apid 8
>>>> ...
>>>>
>>>> Anybody have any ideas?
>>>>
>>>> -Nick
>>>>
>>>> OpenBSD 5.0-current (GENERIC.MP) #146: Mon Nov 28 16:07:10 MST 2011
>>>> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>>>> RTC BIOS diagnostic error 11<memory_size>
>>>> real mem = 4216655872 (4021MB)
>>>> avail mem = 4090273792 (3900MB)
>>>> mainbus0 at root
>>>> bios0 at mainbus0: SMBIOS rev. 2.3 @ 0xf0450 (71 entries)
>>>> bios0: vendor Dell Inc. version "2.1.2" date 12/01/2006
>>>> bios0: Dell Inc. Dell DXC061
>>>> acpi0 at bios0: rev 2
>>>> acpi0: sleep states S0 S3 S4 S5
>>>> acpi0: tables DSDT FACP SSDT APIC BOOT MCFG HPET DUMY SLIC
>>>> acpi0: wakeup devices VBTN(S4) PCI0(S5) PCI4(S5) PCI2(S5) PCI3(S5)
>>>> PCI1(S5) PCI5(S5) PCI6(S5) MOU_(S3) USB0(S3) USB1(S3) USB2(S3)
>>>> USB3(S3) USB4(S3)
>>>> acpitimer0 at acpi0: 3579545 Hz, 24 bits
>>>> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
>>>> cpu0 at mainbus0: apid 0 (boot processor)
>>>> cpu0: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz, 1862.27 MHz
>>>> cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,NXE,LONG
>>>> cpu0: 2MB 64b/line 8-way L2 cache
>>>> cpu0: apic clock running at 266MHz
>>>> cpu1 at mainbus0: apid 1 (application processor)
>>>> cpu1: Intel(R) Core(TM)2 CPU 6300 @ 1.86GHz, 1862.02 MHz
>>>> cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,NXE,LONG
>>>> cpu1: 2MB 64b/line 8-way L2 cache
>>>> ioapic0 at mainbus0: apid 8 pa 0xfec00000, version 20, 24 pins
>>>> ioapic0: misconfigured as apic 0, remapped to apid 8
>>>> acpimcfg0 at acpi0 addr 0xe0000000, bus 0-255
>>>> acpihpet0 at acpi0: 14318179 Hz
>>>> acpiprt0 at acpi0: bus 3 (PCI4)
>>>> acpiprt1 at acpi0: bus 2 (PCI2)
>>>> acpiprt2 at acpi0: bus -1 (PCI3)
>>>> acpiprt3 at acpi0: bus 1 (PCI1)
>>>> acpiprt4 at acpi0: bus -1 (PCI5)
>>>> acpiprt5 at acpi0: bus -1 (PCI6)
>>>> acpiprt6 at acpi0: bus 0 (PCI0)
>>>> acpicpu0 at acpi0
>>>> acpicpu1 at acpi0
>>>> acpibtn0 at acpi0: VBTN
>>>> memory map conflict 0xbf655c00/0x9aa400
>>>> pci0 at mainbus0 bus 0
>>>> pchb0 at pci0 dev 0 function 0 "Intel 82G965 Host" rev 0x02
>>>> ppb0 at pci0 dev 1 function 0 "Intel 82G965 PCIE" rev 0x02: msi
>>>> pci1 at ppb0 bus 1
>>>> em0 at pci1 dev 0 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00:
>>>> msi, address 00:1b:21:ab:bf:ca
>>>> vga1 at pci0 dev 2 function 0 "Intel 82G965 Video" rev 0x02
>>>> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
>>>> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
>>>> intagp0 at vga1
>>>> agp0 at intagp0: aperture at 0xc0000000, size 0x10000000
>>>> inteldrm0 at vga1: apic 8 int 16
>>>> drm0 at inteldrm0
>>>> "Intel 82G965 Video" rev 0x02 at pci0 dev 2 function 1 not configured
>>>> em1 at pci0 dev 25 function 0 "Intel ICH8 IFE" rev 0x02: msi, address
>>>> 00:16:76:c1:5b:1f
>>>> uhci0 at pci0 dev 26 function 0 "Intel 82801H USB" rev 0x02: apic 8 int 16
>>>> uhci1 at pci0 dev 26 function 1 "Intel 82801H USB" rev 0x02: apic 8 int 17
>>>> ehci0 at pci0 dev 26 function 7 "Intel 82801H USB" rev 0x02: apic 8 int 22
>>>> usb0 at ehci0: USB revision 2.0
>>>> uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
>>>> azalia0 at pci0 dev 27 function 0 "Intel 82801H HD Audio" rev 0x02: msi
>>>> azalia0: codecs: Conexant/0x2bfa, Sigmatel STAC9227X, using Sigmatel STAC9227X
>>>> audio0 at azalia0
>>>> ppb1 at pci0 dev 28 function 0 "Intel 82801H PCIE" rev 0x02: msi
>>>> pci2 at ppb1 bus 2
>>>> em2 at pci2 dev 0 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00:
>>>> msi, address 00:1b:21:ab:d3:53
>>>> uhci2 at pci0 dev 29 function 0 "Intel 82801H USB" rev 0x02: apic 8 int 23
>>>> uhci3 at pci0 dev 29 function 1 "Intel 82801H USB" rev 0x02: apic 8 int 17
>>>> uhci4 at pci0 dev 29 function 2 "Intel 82801H USB" rev 0x02: apic 8 int 18
>>>> ehci1 at pci0 dev 29 function 7 "Intel 82801H USB" rev 0x02: apic 8 int 23
>>>> usb1 at ehci1: USB revision 2.0
>>>> uhub1 at usb1 "Intel EHCI root hub" rev 2.00/1.00 addr 1
>>>> ppb2 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0xf2
>>>> pci3 at ppb2 bus 3
>>>> "TI TSB43AB22 FireWire" rev 0x00 at pci3 dev 10 function 0 not configured
>>>> pcib0 at pci0 dev 31 function 0 "Intel 82801HH LPC" rev 0x02
>>>> ahci0 at pci0 dev 31 function 2 "Intel 82801H AHCI" rev 0x02: msi, AHCI 1.1
>>>> scsibus0 at ahci0: 32 targets
>>>> sd0 at scsibus0 targ 0 lun 0: <ATA, SAMSUNG SP2504C, VT10> SCSI3
>>>> 0/direct fixed t10.ATA_SAMSUNG_SP2504C_S09QJ1SP112542
>>>> sd0: 238418MB, 512 bytes/sector, 488281250 sectors
>>>> cd0 at scsibus0 targ 1 lun 0: <TSSTcorp, CDRWDVD TSL462D, DE10> ATAPI
>>>> 5/cdrom removable
>>>> ichiic0 at pci0 dev 31 function 3 "Intel 82801H SMBus" rev 0x02: apic 8 int 20
>>>> iic0 at ichiic0
>>>> spdmem0 at iic0 addr 0x50: 1GB DDR2 SDRAM non-parity PC2-5300CL5
>>>> spdmem1 at iic0 addr 0x51: 1GB DDR2 SDRAM non-parity PC2-5300CL5
>>>> spdmem2 at iic0 addr 0x52: 1GB DDR2 SDRAM non-parity PC2-5300CL5
>>>> spdmem3 at iic0 addr 0x53: 1GB DDR2 SDRAM non-parity PC2-5300CL5
>>>> usb2 at uhci0: USB revision 1.0
>>>> uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>>>> usb3 at uhci1: USB revision 1.0
>>>> uhub3 at usb3 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>>>> usb4 at uhci2: USB revision 1.0
>>>> uhub4 at usb4 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>>>> usb5 at uhci3: USB revision 1.0
>>>> uhub5 at usb5 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>>>> usb6 at uhci4: USB revision 1.0
>>>> uhub6 at usb6 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>>>> isa0 at pcib0
>>>> isadma0 at isa0
>>>> pckbc0 at isa0 port 0x60/5
>>>> pckbd0 at pckbc0 (kbd slot)
>>>> pckbc0: using irq 1 for kbd slot
>>>> wskbd0 at pckbd0: console keyboard, using wsdisplay0
>>>> pcppi0 at isa0 port 0x61
>>>> spkr0 at pcppi0
>>>> mtrr: Pentium Pro MTRR support
>>>> uhidev0 at uhub6 port 1 configuration 1 interface 0 "Logitech USB
>>>> Receiver" rev 1.10/10.20 addr 2
>>>> uhidev0: iclass 3/1
>>>> ukbd0 at uhidev0: 8 modifier keys, 6 key codes
>>>> wskbd1 at ukbd0 mux 1
>>>> wskbd1: connecting to wsdisplay0
>>>> uhidev1 at uhub6 port 1 configuration 1 interface 1 "Logitech USB
>>>> Receiver" rev 1.10/10.20 addr 2
>>>> uhidev1: iclass 3/0, 2 report ids
>>>> uhid0 at uhidev1 reportid 1: input=2, output=0, feature=0
>>>> uhid1 at uhidev1 reportid 2: input=1, output=0, feature=0
>>>> vscsi0 at root
>>>> scsibus1 at vscsi0: 256 targets
>>>> softraid0 at root
>>>> scsibus2 at softraid0: 256 targets
>>>> root on sd0a (4a10c7c95af7b910.a) swap on sd0b dump on sd0b