Hi,
I have one workstation (HP xw4300) running Solaris 10 (x86) with one Digi
Sync570i card.
The system may hang at any time, after anywhere from a few minutes to a couple
of hours, while the card is receiving data frames.
I suspect the hang is caused by the driver module for the Sync570; however,
the same driver works properly on a Solaris 8 system.
We used to install Solaris 8 on the HP xw4100, but now we have to install
Solaris 10 on the HP xw4300 (the HP xw4100 is no longer available on the market).
I boot the Solaris system with kmdb loaded. After the system hangs, I can't ping
the host, and the keyboard and mouse do not respond.
I can get a crash dump file by pressing "F1+A" and then entering "$" at the kmdb prompt. The panic is reported as being due to a NULL pointer dereference:
sched:
#pf Page fault
Bad kernel fault at addr=0x0
pid=0, pc=0x0, sp=0x202, eflags=0x10002
cr0: 8005003b cr4: 6f8
cr2: 0 cr3: 4226000
gs: 1b0 fs:0 es: 160 ds: 160
edi: d2f50a60 esi: fef4b2a8 ebp: d2c84d34 esp: d2c84d1c
ebx: d2f54180 edx: d2f541f8 ecx: 1f eax: fed6c870
trp:e err: 10 eip:0 cs: 158
efl:10002 usp: 202 ss: d2c84d3c
d2c84c4c unix:die+a7 (e, d2c84cec, 0, 0)
d2c84cd8 unix:trap+f56 (d2c84cec, 0, 0)
d2c84cec unix:cmntrap+83 ()
d2c84d34 0 (d2c84d44, fe81189a,)
d2c84d3c genunix:kdi_dvec_enter+a (d2c84d50, fe81183c,)
d2c84d44 unix:debug_enter+32 (0)
d2c84d50 unix:abort_sequence_enter+27 (0)
d2c84d64 kbtrans:kbtrans_streams_key+3e (d2f54180, 1f, 0)
d2c84d88 kb8042:kb8042_received_byte+b2 (fef4b1a8, 1e)
d2c84da0 kb8042:kb8042_intr+65 (fef4b1a8)
d2c84db8 i8042:i8042_intr+a4 (d2f50980)
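The "err: 10" field in the trap frame above can be decoded by hand. The following is only a minimal Python sketch, assuming the standard x86 page-fault error-code bit layout (bit 0 present, bit 1 write, bit 2 user, bit 3 reserved, bit 4 instruction fetch); the table `PF_BITS` and the function `decode_pf_error` are hypothetical names used for illustration and are not part of mdb or Solaris.

```python
# Assumed layout: standard x86 #PF error-code bits.
# Each entry is (meaning when the bit is set, meaning when it is clear).
PF_BITS = {
    0: ("protection violation", "page not present"),
    1: ("write access", "not a write"),
    2: ("user mode", "kernel (supervisor) mode"),
    3: ("reserved bit set in a page-table entry", None),
    4: ("instruction fetch", "data access"),
}


def decode_pf_error(err):
    """Return human-readable descriptions for a page-fault error code."""
    descriptions = []
    for bit in sorted(PF_BITS):
        if_set, if_clear = PF_BITS[bit]
        text = if_set if err & (1 << bit) else if_clear
        if text is not None:
            descriptions.append(text)
    return descriptions


if __name__ == "__main__":
    # "err: 10" in the panic output is hexadecimal, i.e. 0x10.
    for line in decode_pf_error(0x10):
        print(line)
```

Under that assumption, 0x10 would read as an instruction fetch from a non-present page in kernel mode, which would be consistent with pc=0x0 and eip:0 in the output above (a jump or call through a NULL pointer), though the dump alone does not say which code made that call.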
::cpuinfo -v
 ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH   THREAD   PROC
  0 fec20ae4  1b    8    0 104   no    no t-740847 d2c84de0 sched
               |    |    |
    RUNNING <--+    |    +--> PIL THREAD
      READY         |           5 d2c84de0
     EXISTS         |           3 d2ca0de0
     ENABLE         |           - d2c28de0 (idle)
                    |
                    +-->  PRI THREAD   PROC
                           99 d2c9ade0 sched
                           99 d2c97de0 sched
                           60 d3264a00 fsflush
                           60 d2e1ade0 sched
                           60 d2e37de0 sched
                           60 d4644de0 sched
                           60 d96dcde0 sched
                           59 d38e7400 Xsun
d2c84de0::thread
ADDR STATE FLG PFLG SFLG PRI EPRI PIL INTR DISPTIME BOUND PR
d2c84de0 onproc80903 104 0 5 d2ca0de0 0 -1 2
d2ca0de0::thread
ADDR STATE FLG PFLG SFLG PRI EPRI PIL INTR DISPTIME BOUND PR
d2ca0de0 onproc 903 102 0 3 d2c28de0 46a51 -1 1
d2ca0de0::findstack -v
stack pointer for thread d2ca0de0: d2ca0c2c
d2ca0de0 0xd94c62bc()
After I pressed "F1+A", the kernel created thread d2c84de0 to service the
keyboard interrupt (PIL = 5, PRI = 104), but another thread, d2ca0de0, was
still running on the CPU at the same time (PIL = 3, PRI = 102).
I guess that some event caused the kernel to create thread d2ca0de0, after
which the kernel hung. When I then pressed "F1+A", the kernel created another
thread, d2c84de0, and finally crashed.
I have no idea what caused the kernel to create thread d2ca0de0
(PRI = 102, PIL = 3).
[[ Q3 ]]
::cpuinfo
 ID ADDR     FLG NRUN BSPL PRI RNRN KRNRN SWITCH   THREAD   PROC
  0 fec20ae4  1b    8    0 104   no    no t-740847 d2c84de0 sched
::cycinfo -v
CPU  CYC_CPU    STATE NELEMS     ROOT         FIRE HANDLER
  0 d9aabe00   online      4 d9aabd80  96b6b848e80 clock

                 2
                 |
          +------+------+
          0             1
          |             |
      +---+---+     +---+---+
      3
      |
    +-+-+

    ADDR NDX HEAP LEVL   PEND         FIRE USECINT HANDLER
d9aabd80   0    1 high      0  96b6b848e80       1 cbe_hres_tick
d9aabda0   1    2  low 741253  96b6b848e80       1 apic_redistribute_compute
d9aabdc0   2    0 lock    406  96b6b848e80       1 clock
d9aabde0   3    3 high      0  96b6d4e5200     100 deadman
---
The SWITCH value shown for thread d2c84de0 is 740847;
the PEND value of apic_redistribute_compute is 741253;
the PEND value of clock is 406.
Note that (741253 - 406) == 740847.
What does this mean? Could you please explain it?
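The relation between the three counters can be checked mechanically. The sketch below only restates the arithmetic from the output above in Python; the names `pend_difference` and `ticks_to_seconds` are hypothetical, and the tick rate `HZ = 100` is an assumption that the output does not confirm.

```python
# Values copied from the ::cpuinfo and ::cycinfo -v output above.
SWITCH_TICKS = 740847   # from "t-740847" in the SWITCH column
PEND_APIC = 741253      # PEND of apic_redistribute_compute (low level)
PEND_CLOCK = 406        # PEND of clock (lock level)

# ASSUMPTION: 100 clock ticks per second. This is not shown in the dump.
HZ = 100


def pend_difference(pend_low, pend_lock):
    """Difference between the two pending counts."""
    return pend_low - pend_lock


def ticks_to_seconds(ticks, hz=HZ):
    """Convert a tick count to seconds under the assumed tick rate."""
    return ticks / hz


if __name__ == "__main__":
    diff = pend_difference(PEND_APIC, PEND_CLOCK)
    print("difference:", diff, "matches SWITCH:", diff == SWITCH_TICKS)
    print("seconds at assumed HZ:", ticks_to_seconds(SWITCH_TICKS))
```

If the 100 Hz assumption holds, 740847 ticks is a little over two hours, which is at least the same order of magnitude as the "couple of hours" before a hang described earlier. Whether the exact match between the difference and the SWITCH value is meaningful or a coincidence is exactly what I would like to understand.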
[[ Q4 ]]
::ipcs
Message queues:
failed to read 'msq_svc'; module not present
Shared memory:
ADDR REFID KEY MODE PRJID ZONEID OWNER GROUP CREAT CGRP
d4