There are many things you can do to diagnose a system that is totally
unresponsive.

You should spool output from v$session_wait joined to v$session every
minute, so that you have a record of which waits sessions were piling
up on in the lead-up to the hang.  This might allow you to trigger a
systemstate or hanganalyze dump programmatically upon detection of a
particular pileup of waits, before the server becomes completely
hung.
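
As a rough sketch of that once-a-minute spooling (not a tested
production script), something like the following could be left
running.  The log path, the "/ as sysdba" OS-authenticated login, the
column list, and the function names are all my assumptions; adjust them
for your site.

```shell
#!/bin/sh
# Sketch: append a timestamped snapshot of current waits to a log file.
# Assumptions: OS-authenticated "/ as sysdba" login, the log location,
# and the chosen columns.

WAIT_LOG=${WAIT_LOG:-/tmp/session_waits.log}   # hypothetical location
INTERVAL=${INTERVAL:-60}                       # seconds between snapshots
SQLPLUS=${SQLPLUS:-sqlplus}                    # override if not on PATH

snapshot_sql() {
    # The quoted 'EOF' keeps the shell from expanding $session.
    cat <<'EOF'
set pagesize 1000 linesize 200 feedback off
select s.sid, s.serial#, s.username, s.status,
       w.event, w.p1, w.p2, w.p3, w.seconds_in_wait, w.state
  from v$session s, v$session_wait w
 where s.sid = w.sid
 order by w.event, s.sid;
exit
EOF
}

take_snapshot() {
    {
        echo "=== `date '+%Y-%m-%d %H:%M:%S'` ==="
        snapshot_sql | "$SQLPLUS" -s "/ as sysdba"
    } >> "$WAIT_LOG" 2>&1
}

# Loop only when asked to, so the functions can also be sourced.
if [ "${RUN_LOOP:-0}" = 1 ]; then
    while :; do
        take_snapshot
        sleep "$INTERVAL"
    done
fi
```

Start it with RUN_LOOP=1 in the environment.  Once the instance is
completely hung the sqlplus call will itself hang, so the useful data
is whatever was logged in the minutes before that.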

Also, you should identify a typical hung server process (such as the
one serving a SQL*Plus session that you logged in with before the hang
and that is now hanging), attach to it using a debugger such as gdb,
and obtain a backtrace.

$ gdb $ORACLE_HOME/bin/oracle <pid>
...
(gdb) bt
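
If you would rather capture the backtrace to a file than type at the
gdb prompt, a sketch along these lines may help.  The function name and
output directory are hypothetical, and it assumes a gdb that accepts
-batch and -x with a command file.

```shell
#!/bin/sh
# Sketch: save a backtrace of one server process to a file.
# Assumptions: the output directory, the function name, and that gdb
# is on the PATH.

GDB=${GDB:-gdb}           # override to point at a specific gdb
BT_DIR=${BT_DIR:-/tmp}    # hypothetical place to keep the output

backtrace_pid() {
    pid=$1
    cmds=$BT_DIR/gdb_cmds.$$
    out=$BT_DIR/oracle_bt.$pid.txt

    # Commands gdb runs after attaching: print the stack, then let go
    # of the process so that it is not left stopped.
    printf 'bt\ndetach\nquit\n' > "$cmds"

    "$GDB" -batch -x "$cmds" "$ORACLE_HOME/bin/oracle" "$pid" > "$out" 2>&1
    rm -f "$cmds"

    # Print the name of the file holding the backtrace.
    echo "$out"
}
```

Call it as "backtrace_pid <pid>" with ORACLE_HOME set; it prints the
name of the file it wrote.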

With that information, you can tell Oracle Support what calls your
processes are getting hung up in.

Another tactic you should take is to use the ps command to look at
which oracle processes are getting CPU time.

$ ps -eo pid,pcpu,state,command | sort -n -k 2

If any are getting CPU time, you can use a system call trace utility
such as strace, tusc or truss to find out what that process is doing.
Additionally, you can obtain a backtrace (as described above) of any
processes using up CPU time while the instance is hanging.  This
information can be forwarded to Oracle Support.
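
The ps-then-trace step can be sketched as a small filter.  This is only
an illustration: the function names are made up, it assumes your ps
accepts the pid, pcpu and comm output columns, and it assumes the
server processes show up with "oracle" in the command name.

```shell
#!/bin/sh
# Sketch: pick out oracle processes that are getting CPU time and print
# the trace command to run against each one.
# Assumptions: the function names, the "comm" column, and matching on
# the string "oracle".

TRACER=${TRACER:-truss}   # e.g. strace on Linux

# Reads "PID %CPU COMMAND" lines on stdin and prints the PIDs of oracle
# processes whose %CPU is above zero.  NR > 1 skips the ps header.
busy_oracle_pids() {
    awk 'NR > 1 && $2 > 0 && $3 ~ /oracle/ { print $1 }'
}

# Turns a list of PIDs on stdin into trace commands, one per line.
trace_commands() {
    while read pid; do
        echo "$TRACER -p $pid"
    done
}

# Usage:
#   ps -eo pid,pcpu,comm | busy_oracle_pids | trace_commands
```

It only prints the commands rather than running them, since a trace
keeps running until you interrupt it and you will want to look at one
process at a time.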

BTW, Oracle Support should have already suggested all these things to
you.  They are standard hang diagnosis steps.

--
Jeremiah Wilton
http://www.speakeasy.net/~jwilton

On Tue, 19 Mar 2002, Grabowy, Chris wrote:

> Here is the latest on the Oracle hangs issue from the DBA working it...
> --------------------------------------------------------------
> In 8.1.7.2 we had frequent crashes, and Oracle told us to upgrade to 8.1.7.3
> to solve that.  The crashes then turned into these hangs.  The problem is
> our hangs are total freezes - we can't log on, we can't do anything with
> sessions already logged on - nothing.  That means we are not able to run
> Statspack or Hanganalyze or anything like that to help us figure out what's
> causing it.  If you have any other ideas we're open to anything.
> 
> -----Original Message-----
> Sent: Monday, March 18, 2002 1:28 PM
> To: Multiple recipients of list ORACLE-L
> 
> We have a particular database that hangs on a regular basis.  Here are the
> stats and symptoms.
> 
> Oracle stats
> ------------------------------------------
> 8.1.7.3 (highest patch level applied)
> Solaris 2.7
> UTF8 character set
> 
> Symptoms
> ------------------------------------------
> Random hanging.
>      Hanging meaning SQL processing stops.
>      New connections "hang".
> No trace files.
> No messages in the alert log.
> Killing the Oracle processes is the only way to recover from the problem.
> 
> This problem has been reported to Oracle Support; they are now
> escalating it.

Author: Jeremiah Wilton
  INET: [EMAIL PROTECTED]
