Alain,
I am not sure exactly what the problem is here, but I suggest you save the crash
dump and analyze it.
For Solaris 6, you need to uncomment these lines in /etc/init.d/sysetup:
========================================
##
## Default is to not do a savecore
##
#if [ ! -d /var/crash/`uname -n` ]
#then mkdir -m 0700 -p /var/crash/`uname -n`
#fi
# echo 'checking for crash dump...\c '
#savecore /var/crash/`uname -n`
# echo ''
=========================================
For Solaris 7, savecore is already enabled by default and is managed by dumpadm (which
also lets you do goodies like dumping to a device other than swap).
In any case, you need the crash dump saved under /var/crash/<hostname> (on Solaris 7
the directory is specified in /etc/dumpadm.conf), and you need to get hold of these two
files:
vmcore.n unix.n (where n is the sequence number of the saved dump).
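If several dumps have piled up, a small helper can pick the newest complete pair. This is only a sketch: the function name latest_dump is made up for illustration, and it assumes a POSIX shell (ksh or /usr/xpg4/bin/sh rather than the old Bourne /bin/sh) and that dumps are numbered vmcore.n / unix.n as described above.

```shell
#!/bin/sh
# latest_dump DIR: print the highest n for which both vmcore.n and unix.n
# exist in DIR. Prints nothing and returns 1 if no complete pair is found.
# (Hypothetical helper, not part of Solaris.)
latest_dump() {
    dir=$1
    best=
    for f in "$dir"/vmcore.*; do
        [ -f "$f" ] || continue            # glob matched nothing
        n=${f##*.}                         # suffix after the last dot
        case $n in
            ''|*[!0-9]*) continue ;;       # skip non-numeric suffixes
        esac
        [ -f "$dir/unix.$n" ] || continue  # need the matching namelist too
        if [ -z "$best" ] || [ "$n" -gt "$best" ]; then
            best=$n
        fi
    done
    [ -n "$best" ] || return 1
    echo "$best"
}
```

For example, latest_dump /var/crash/`uname -n` would print 8 if vmcore.8 and unix.8 are the newest pair there.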
Now you can run the crash utility to do fun stuff on these files; they are the system
image at the moment the machine panicked. So my suggestion is as follows.
(This is what I would do on Solaris 6, but it should be applicable to Solaris 7 as well.
Reading the crash man page is a *MUST*.)
-> After the panic, boot the system and run dmesg (which shows the last console
messages). Find the reason for the panic and which process caused it, and note down its
PID.
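To pull just the interesting lines out of a long message buffer, something like the following may help. It is a rough sketch: the function name panic_lines is made up, the two patterns are simply the ones visible in your log excerpt, and on an older Solaris you may need egrep instead of grep -E.

```shell
#!/bin/sh
# panic_lines: read log text on stdin and print only the lines that mention
# a panic or a BAD TRAP. (Hypothetical helper; patterns taken from the log
# excerpt in the original post, so adjust them to your own messages.)
panic_lines() {
    grep -E 'panic|BAD TRAP'
}
```

Typical use would be dmesg | panic_lines, or panic_lines < /var/adm/messages.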
-> Run the crash utility (let's say on the 8th dump):
crash -d vmcore.8 -n unix.8
(This gives you a > prompt, where you run crash's own commands to dig further. You may
also want to save the output of your whole crash session to a file; see the -w option.)
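If you would rather not type at the > prompt, the session can probably be scripted. The sketch below assumes that crash reads its commands from standard input when it is not attached to a terminal (please check the man page for your release); the function name dump_proc_table and the CRASH variable are made up for illustration.

```shell
#!/bin/sh
# Assumption: crash accepts commands on stdin. CRASH may be overridden to
# point at a different binary (or a stub, for a dry run).
CRASH=${CRASH:-crash}

# dump_proc_table DIR N OUTFILE: run "p -e" against dump N in DIR and have
# crash write the session output to OUTFILE through its -w option.
# (Hypothetical wrapper, not part of Solaris.)
dump_proc_table() {
    dir=$1 n=$2 out=$3
    "$CRASH" -d "$dir/vmcore.$n" -n "$dir/unix.$n" -w "$out" <<EOF
p -e
q
EOF
}
```

For example: dump_proc_table /var/crash/saturn 8 /tmp/crash.8.out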
Ex: To see the process table, use p -e (this gives a listing similar to ps -ef):
> p -e
PROC TABLE SIZE = 7978
SLOT ST   PID  PPID  PGID   SID   UID PRI NAME     FLAGS
   0 t      0     0     0     0     0  96 sched    load sys lock
   1 s      1     0     0     0     0  58 init     load
   2 s      2     0     0     0     0  98 pageout  load sys lock nowait
   3 s      3     0     0     0     0  60 fsflush  load sys lock nowait
   4 s   1055     1  1055  1055     0  58 sac      load jctl
(Look through this listing for the PID of the process that caused the crash. Sometimes
it is the child of some other process (its PPID). Find the row whose PID equals that
PPID and identify the parent. This way you can trace your way back to the culprit
process.)
(Example: say a copy (cp) process caused the panic on your system, which you found out
from the dmesg output. Find that process in the > p -e output of the crash dump:
  93 p   4957  4955   452   452   225  60 cp       load
So PID=4957, PPID=4955. It was invoked from a shell process (PID=4955, PPID=4949):
 107 s   4955  4949   452   452   225  29 sh       load
and so on. Keep tracing and you will find the originating process.)
The last process in the trace will (in general) be init, with PID 1 (the very first
process that Solaris starts).
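The PID-to-PPID walk described above can be automated once the p -e listing has been saved to a file (for instance with -w). This is a sketch only: trace_parents is a made-up name, it assumes the column order shown in the listing above (PID in column 3, PPID in column 4, NAME in column 9), and on Solaris you would likely need nawk rather than the old awk for the -v option.

```shell
#!/bin/sh
# trace_parents FILE PID: walk from PID up through its parents, using a
# saved "p -e" listing in FILE. Prints one line per process found.
# (Hypothetical helper; assumes the column layout of the listing above.)
trace_parents() {
    awk -v start="$2" '
        # Data rows start with a numeric slot; field 3 is PID, 4 is PPID, 9 is NAME.
        $1 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
            ppid[$3] = $4
            name[$3] = $9
        }
        END {
            pid = start
            # Walk child -> parent until the PID is unknown or repeats.
            while ((pid in ppid) && !(pid in seen)) {
                seen[pid] = 1
                printf "%s %s (ppid %s)\n", pid, name[pid], ppid[pid]
                pid = ppid[pid]
            }
        }
    ' "$1"
}
```

For example, trace_parents /tmp/crash.8.out 4957 would list cp, then sh, then whatever started the shell, stopping when a parent is not in the listing or a PID repeats (as with sched, whose parent is itself).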
You can find more interesting stuff in a crash dump this way. I strongly recommend
reading the crash man page, since Sun changes crash dump analysis with every kernel
release. I have done crash dump analysis on Solaris 6, not on Solaris 7, but
extrapolating from the above I could try it on Solaris 7, although I am still waiting
for a Solaris 7 box to crash somewhere within my reach.
Hope this information helps!
Good Luck!
Rajeev
Alain Fauconnet wrote:
>
> Hello,
>
> I'm very sorry if this is a repost, but from the list archives it seems
> obvious that my two previous attempts have not made it to the list, so I
> try again.
>
> We have CP-FW1 v4.1 (base):
>
> # fw ver
> This is Check Point VPN-1(TM) & FireWall-1(R) Version 4.1 Build 41439
> [VPN]
>
> running on a Sun Ultra-10 machine (128Mb memory) with a quad-Ethernet
> (qfe) card. O/S is Solaris 7 with recommended + relevant (qfe driver)
> patches installed.
>
> This machine is giving us much worries. It panics up to 4 times a day,
> always with the same kind of message:
>
> Jun 1 17:23:02 saturn unix: fw_send: NULL q (700c2320)
> Jun 1 17:32:04 saturn unix: fw_lock: already locked. current =
> fw_filter (in),
> previous = fw_filter (in), level=2
> Jun 1 17:32:05 saturn unix: FW-1: panic(1): fw_lock
>
> Then it goes:
>
> Jun 1 17:32:01 saturn unix: BAD TRAP: cpu=0 type=0x31 rp=0x704f6c78
> addr=0x24 mmu_fsr=0x0
> Jun 1 17:32:01 saturn unix: BAD TRAP occurred in module "fw" due to a
> NULL pointer dereference.
>
> Hardware failure has been ruled out, because the disk has been moved to
> another Ultra-10 which showed the same panics. File corruption is very
> unlikely (all packages including CP-FW1 itself have been pkgchk'ed)
>
> The machine has had a fresh Sol. 7 install, no would-be-guru system
> tuning and no unusual things running on it.
> Support from the local CP dealer has been... well... almost unexistent,
> so I'm sending this message in a bottle, hoping that someone on the list
> has heard of such problems.
>
> Would this be a known problem fixed in SP-1 ?
>
> By the way if I ever manage to get SP-1 from our local distributor (I'm
> almost despaired, they've been so clueless so far), will I need a new
> key or will my 4.1 "base" key work ?
>
> Any hints will be GREATLY appreciated, this is becoming my nightmare.
>
> Alain Fauconnet
> Sr. Unix Sysadmin
> CS Communications Co. Ltd.
> Thailand
>
> ================================================================================
> To unsubscribe from this mailing list, please see the instructions at
> http://www.checkpoint.com/services/mailing.html
> ================================================================================
--
#########################################################################
(Titanic creators used Linux to simulate the sinking of the great ship)
#########################################################################
Rajeev Kumar ([EMAIL PROTECTED])
Fluent Inc. 10, Cavendish Court, Lebanon NH-03766
-------------------------------------------------------------------------
Phone :: (603)-643-2600 x 349 Fax :: (603)-643-3967
Web:: http://www.fluent.com
#########################################################################