Hi all,
I've got a couple of pfSense servers running:
2.2.6-RELEASE (amd64)
built on Mon Dec 21 14:50:08 CST 2015
FreeBSD 10.1-RELEASE-p25
on
Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
4 CPUs: 2 package(s) x 2 core(s)
Memory usage
5% of 4059 MB (fw1)
Memory usage
7% of 3035 MB (fw2)
configured in CARP for High Availability, both with 5 Ethernet ports (bce and
igb).
Both firewalls worked like a charm for a long time while they were
running 32-bit versions.
The official pfSense documentation recommends using 64-bit releases on
64-bit systems, so a few months ago I upgraded both from 2.2.4 (config. rev.
11.9) 32-bit to the release listed above.
To change to the 64-bit release I saved the configuration files, reinstalled
the 64-bit version from scratch and then restored the saved configuration files.
Right after the reinstall, the firewalls began to crash as soon as they were
up. So I added
kern.ipc.nmbclusters100
in System: Advanced: System Tunables and that problem was solved.
Since then, however, both firewalls have been unstable: they crash frequently.
fw2 (backup) crashes several times every day, while fw1 crashes every
couple of days.
My first thought was that the crashes are related to hardware issues, but this
doesn't explain why the problem began after upgrading to 2.2.6 64-bit,
or why the behaviour is similar, but not the same, on both firewalls:
the probability of hardware failure on both at the same time is very low...
Looking at some documentation online, I have also added the following
lines to
/boot/loader.conf.local
hw.bce.tso_enable=0
hw.pci.enable_msix=0
hw.pci.enable_msi=0
net.inet.tcp.tso=0
so at the moment the file is:
kern.cam.boot_delay=1
legal.intel_ipw.license_ack=1
legal.intel_iwi.license_ack=1
hw.bce.tso_enable=0
hw.pci.enable_msix=0
hw.pci.enable_msi=0
net.inet.tcp.tso=0
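To double-check that the four added lines really are in the file on both firewalls, something like the following Python sketch could be used. The function names and the EXPECTED table are made up for illustration; it only inspects the text of the file and says nothing about whether the kernel actually honours each setting.

```python
# Hypothetical helper: checks the *contents* of a loader.conf-style file.
# It does not query the running kernel.

# The four settings added above, with the values used in this post.
EXPECTED = {
    "hw.bce.tso_enable": "0",
    "hw.pci.enable_msix": "0",
    "hw.pci.enable_msi": "0",
    "net.inet.tcp.tso": "0",
}


def parse_loader_conf(text):
    """Parse 'name=value' lines into a dict, ignoring '#' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        # Values may optionally be wrapped in double quotes.
        settings[name.strip()] = value.strip().strip('"')
    return settings


def missing_or_different(settings, expected=EXPECTED):
    """Return the expected names whose value is absent or differs."""
    return {
        name: settings.get(name)
        for name, value in expected.items()
        if settings.get(name) != value
    }


# Example use (path as given in this post):
#   with open("/boot/loader.conf.local") as fh:
#       print(missing_or_different(parse_loader_conf(fh.read())))
```

An empty result means all four lines are present with the value 0.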
I have a large set of crash reports (44 for fw2 and 4 for fw1), but I
think they are too big to be posted here...
Here is an example of the last lines of a crash report:
_
Fatal trap 9: general protection fault while in kernel mode
cpuid = 1; apic id = 01
instruction pointer = 0x20:0x80b30bd3
stack pointer = 0x28:0xfe003be5da60
frame pointer = 0x28:0xfe003be5da90
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (irq18: igb2 bce0+)
version.txt:
FreeBSD 10.1-RELEASE-p25 #0 c39b63e(releng/10.1)-dirty: Mon Dec 21 15:20:13 CST 2015
root@pfs22-amd64-builder:/usr/obj.RELENG_2_2.amd64/usr/pfSensesrc/src.RELENG_2_2/sys/pfSense_SMP.10
_
What I see in the crash reports from fw2 is that 21 of the 44 have the
same stack pointer value, 0x28:0xfe003be5da60, and 14 have the frame
pointer 0x28:0xfe003be5da90. Almost always, but not always, the current
process is related to bce0.
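For anyone who wants to repeat this kind of count on their own reports, here is a rough Python sketch. It assumes each report contains lines of the form "label = value" as in the example above; the function name and field list are my own choices, not part of any pfSense tool.

```python
import re
from collections import Counter

# Labels as they appear in the "Fatal trap" block of a crash report.
FIELDS = (
    "instruction pointer",
    "stack pointer",
    "frame pointer",
    "current process",
)


def tally_crash_fields(text):
    """Count how often each value occurs for each field in the given text."""
    counts = {field: Counter() for field in FIELDS}
    for line in text.splitlines():
        for field in FIELDS:
            # Match e.g. "stack pointer = 0x28:0xfe003be5da60",
            # tolerating extra spaces or tabs around the '='.
            match = re.match(
                r"\s*" + re.escape(field) + r"\s*=\s*(.+?)\s*$", line
            )
            if match:
                counts[field][match.group(1)] += 1
    return counts


# Example use, with the text of all reports concatenated into `text`:
#   for field, counter in tally_crash_fields(text).items():
#       print(field, counter.most_common(3))
```

The most common values per field then show at a glance whether the crashes cluster on one stack or frame pointer, or on one interrupt thread.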
Here's the list of the last crashes and my notes:
2016.02.02
Crashes:
fw1 ~11:00
fw2 ~17:00
___
pfSense mailing list
https://lists.pfsense.org/mailman/listinfo/list
Support the project with Gold! https://pfsense.org/gold