Hi,
We're having intermittent panics on our Solaris 9 systems running IP
Filter v3.4.35 as a 'host firewall'. Sun uses the "Not our problem"
technique when "supporting" us on this issue, so I'm asking here if
anyone has clues as to the underlying issue/bug.
IPF 3.4.35 was compiled on Solaris 9 using gcc and, aside from these
intermittent kernel panics, seems to work just fine. The panics seem to
occur only after a system has been up for a few days or more, and no
abnormal network activity appears to be associated with them. I'm
eagerly awaiting a Solaris 10 upgrade, at which point Sun _will_
support IPF, but until then I'd like to know whether anyone else has
seen this problem or has any suggestions for preventing it.
Regards,
Bill
-----snippet of core analysis from Sun-----
--vmcore.2--
Summary: The system panicked due to a problem within IP Filter v3.4.35
(3rd party driver). Since the "ipf" driver is not supported in Solaris 9
or older, please contact the vendor of the driver for further analysis.
core file: /cores/dir0/64821844/vmcore.2
user: Ning Bell (nb106596:107596)
release: 5.9 (64-bit)
version: Generic_118558-02
machine: sun4u
hw_provider: Sun_Microsystems
system type: SUNW,Sun-Fire-V440 (UltraSPARC-IIIi)
time of crash: Mon Nov 21 12:50:08 MST 2005
age of system: 10 days 8 hours 2 minutes 48.82 seconds
panic CPU: 1 (4 CPUs, 8G memory)
panic string: BAD TRAP: type=31 rp=2a100185090 addr=81a4000100000018
mmu_fsr=0
Mon Nov 21 12:50:08 2005|
| panic[cpu1]/thread=2a100185d40:
Mon Nov 21 12:50:08 2005| BAD TRAP: type=31 rp=2a100185090
addr=81a4000100000018 mmu_fsr=0
SolarisCAT(vmcore.2/9U)> panic
panic on cpu 1
panic string: BAD TRAP: type=31 rp=2a100185090 addr=81a4000100000018
mmu_fsr=0
==== panic kernel thread: 0x2a100185d40 PID: 0 on CPU: 1 ====
cmd: sched
t_procp: 0x1438a78(proc_sched) p_as: 0x1438960(kas)
t_stk: 0x2a100185b50 sp: 0x1437cb1 t_stkbase: 0x2a100182000
t_pri: 60(SYS) pctcpu: 0.000000 t_lwp: 0x0
psrset: 0 last CPU: 1
idle: 11 ticks (0.11 seconds)
start: Fri Nov 11 04:47:56 2005
age: 892932 seconds (10 days 8 hours 2 minutes 12 seconds)
stime: 4300 (10 days 8 hours 2 minutes 5.82 seconds earlier)
tstate: TS_ONPROC - thread is being run on a processor
tflg: T_TALLOCSTK - thread structure allocated from stk
T_PANIC - thread initiated a system panic
tpflg: none set
tsched: TS_LOAD - thread is in memory
TS_DONT_SWAP - thread/LWP should not be swapped
TS_SIGNALLED - thread was awakened by cv_signal()
pflag: SSYS - system resident process
SLOAD - in core
SLOCK - process cannot be swapped
pc: 0x104bd28 unix:panicsys+0x44: call unix:setjmp
startpc: 0x7806f66c ce:ce_drain_fifo+0x0: save %sp, -0x280, %sp
unix:panicsys+0x44(0x105bc90, 0x2a100184e48, 0x1438680, 0x1, 0x0, 0x27,
0x8900001606, , , , , , , , 0x105bc90, 0x2a100184e48)
unix:vpanic+0xcc(0x105bc90, 0x2a100184e48, 0x23, 0x1, 0x8, 0x8)
unix:panic+0x1c(0x105bc90, 0x31, 0x2a100185090, 0x81a4000100000018,
0x0,0x2000)
unix:die+0xa4(0x31, 0x2a100185090, 0x81a4000100000018, 0x0, 0x0, 0x0)
unix:trap+0x874(0x2a100185090, 0x81a4000100000018, , , 0x81a40001, 0x8)
unix:ktl0+0x48()
-- trap data type: 0x31 (data access MMU miss) rp: 0x2a100185090 --
addr: 0x81a4000100000018
pc: 0x780b49a4 ipf:fr_scanlist+0xf4: ldx [%g4 + 0x18], %g1
npc: 0x780b49a8 ipf:fr_scanlist+0xf8: brz,pt %g1,
ipf:fr_scanlist+0x124
global: %g1 0
%g2 0x1 %g3 0x1388
%g4 0x81a4000100000000 %g5 0x81a4000100000000
%g6 0 %g7 0x2a100185d40
out: %o0 0x1fe6eb %o1 0x1
%o2 0x1fe6ea %o3 0x1fe6ea
%o4 0x8 %o5 0x8
%sp 0x2a100184931 %o7 0x780c23fc
loc: %l0 0x300024e0000 %l1 0x2a100185d40
%l2 0x1 %l3 0x1
%l4 0x300003f3ea8 %l5 0
%l6 0x14d8000 %l7 0x14d8000
in: %i0 0x202 %i1 0x3000dd3d710
%i2 0x2a100185380 %i3 0x3003382e840
%i4 0x37 %i5 0x780d8c20
%fp 0x2a100184a51 %i7 0x780b5c0c
<trap>ipf:fr_scanlist+0xf4(0x202, 0x3000dd3d710, 0x2a100185380,
0x3003382e840, 0x37, 0x780d8c20)
ipf:fr_check+0x7e8(0x3000dd3d710, 0x14, 0x30002697438, 0x0,
0x2a1001856a0, 0x2a1001858a8)
ipf:fr_precheck+0x1028(0x2a1001858a8, 0x30002701208, 0x2a1001856a0, 0x0,
0x0, 0x27)
ipf:fr_qin+0x474(0x30002701208, 0x3003382e840, 0x20, 0x0, 0x1,
0x3003382e840)
unix:putnext+0x21c(0x30002701498, 0x3003382e840, , 0x90, 0x3000dd3d710,
0x3000dd3d702)
ce:ce_drain_fifo+0x50a8(0x30003478e18, 0x0)
unix:thread_start+0x4()
-- end of kernel thread's stack --
SolarisCAT(vmcore.2/9U)> rdi -p fr_scanlist+0xf4 10
ipf:fr_scanlist+0xcc: lduw [%fp + 0x7df], %g1
ipf:fr_scanlist+0xd0: subcc %g1, 0x0, %g0 ( cmp %g1, 0x0 )
ipf:fr_scanlist+0xd4: be,pt %icc, ipf:fr_scanlist+0xf0 (3f)
ipf:fr_scanlist+0xd8: nop
ipf:fr_scanlist+0xdc: lduw [%fp + 0x7df], %g1
ipf:fr_scanlist+0xe0: add %g1, -0x1, %g1 ( dec %g1 )
ipf:fr_scanlist+0xe4: stw %g1, [%fp + 0x7df]
ipf:fr_scanlist+0xe8: ba,pt %xcc, ipf:fr_scanlist+0xb00 (28f)
ipf:fr_scanlist+0xec: nop
ipf:fr_scanlist+0xf0: 3: ldx [%fp + 0x7bf], %g4 <<<
ipf:fr_scanlist+0xf4: ldx [%g4 + 0x18], %g1 <<<
ipf:fr_scanlist+0xf8: brz,pt %g1, ipf:fr_scanlist+0x124 (4f)
SolarisCAT(vmcore.2/9U)> rd 0x2a100184a51+0x7bf
0x2a100185210 = 0x81a4000100000000
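
The address arithmetic in this lookup can be double-checked with a few
lines of Python. This is only a sketch using the register values printed
in the dump above; the variable names are made up for illustration, and
the one outside fact it relies on is the 2047-byte (0x7ff) stack bias of
the 64-bit SPARC V9 ABI, which is why %fp looks odd and the offsets look
large.

```python
# Sanity check of the addresses in the crash dump. All values are copied
# from the register dump and disassembly; names here are illustrative.
STACK_BIAS = 0x7FF             # SPARC V9 64-bit ABI stack bias (2047)

fp = 0x2A100184A51             # %fp at the time of the trap (biased)
g4 = 0x81A4000100000000        # value loaded by: ldx [%fp + 0x7bf], %g4
fault_addr = 0x81A4000100000018

# Stack slot read at fr_scanlist+0xf0.
slot = fp + 0x7BF

# Effective address of the faulting load: ldx [%g4 + 0x18], %g1
effective = (g4 + 0x18) & 0xFFFFFFFFFFFFFFFF

# Where that slot sits relative to the unbiased frame pointer.
local_offset = 0x7BF - STACK_BIAS

print(hex(slot))                 # 0x2a100185210
print(hex(effective))            # 0x81a4000100000018
print(effective == fault_addr)   # True
print(local_offset)              # -64
```

In other words, the pointer held in a local variable 64 bytes below the
frame pointer was already bad before fr_scanlist+0xf4 dereferenced it at
offset 0x18.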
Mon Nov 21 12:50:08 2005| addr=0x81a4000100000018
>>> panic address is 0x81a4000100000018 which matches %g4 + 0x18
SolarisCAT(vmcore.2/9U)> modinfo -p ipf
ID flags modctl textaddr size cnt name
107 LIN 0x300003c1490 0x780b0000 0x2c4c8 1 ipf (IP Filter: v3.4.35)
>>> since Sun doesn't supply and support the "ipf" driver in Solaris 9
or older, the customer will need to contact the vendor of the driver for
a fix
Thanks
--Ning
Sun Microsystems - Broomfield, CO
Kernel Support Engineer - (303) 272-4808
To reach another kernel engineer: 1800usa4sun, extension 79273#
To view your case online: http://www.sun.com/service/online