Hi, I have a RAC 9.2.0.2 database running on Red Hat AS 2.1. It has been up and running for about four months (one month ago I patched it to 9.2.0.2). Last night one instance died unexpectedly while the other instance kept running. Although little business was affected, I want to know why it died, but I have been unable to find the cause, so I am looking for your help. Here is some information: I tested the interconnect and the service network card, and both were running fine (at 7:00 am); the disk system is also OK.
Alert log file:
--------------------------------------------------------------------------------
Fri Dec 20 23:38:24 2002
Thread 2 advanced to log sequence 265
  Current log# 3 seq# 265 mem# 0: /dev/raw/raw8
Sat Dec 21 04:04:29 2002
Errors in file /home/oracle/admin/rac/bdump/rac2_lmon_1634.trc:
ORA-29740: evicted by member 0, group incarnation 7
Sat Dec 21 04:04:29 2002
LMON: terminating instance due to error 29740
Sat Dec 21 04:04:31 2002
Trace dumping is performing id=[cdmp_20021221040431]
Sat Dec 21 04:04:34 2002
Instance terminated by LMON, pid = 1634
Sat Dec 21 07:38:36 2002
Starting ORACLE instance (normal)
Sat Dec 21 07:38:36 2002
--------------------------------------------------------------------------------

And this is from the trace file:
--------------------------------------------------------------------------------
[oracle@rac2 bdump]$ cat /home/oracle/admin/rac/bdump/rac2_lmon_1634.trc
/home/oracle/admin/rac/bdump/rac2_lmon_1634.trc
Oracle9i Enterprise Edition Release 9.2.0.2.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Oracle Data Mining options
JServer Release 9.2.0.2.0 - Production
ORACLE_HOME = /home/oracle/9.2.0
System name: Linux
Node name: rac2
Release: 2.4.9-e.3smp
Version: #1 SMP Fri May 3 16:48:54 EDT 2002
Machine: i686
Instance name: rac2
Redo thread mounted by this instance: 0
Oracle process number: 4
Unix process pid: 1634, image: oracle@rac2 (LMON)

*** SESSION ID:(3.1) 2002-11-22 03:29:38.649
Batch msg size = 2048
Batching factor: enqueue replay 48, ack 53
Batching factor: cache replay 34 size per lock 56
kjxggin: receive buffer size = 32768
kjxgmin: SKGXN ver (2 1 Oracle 9i Reference CM)
CMCLI WARNING: CMInitContext: init ctx(0xacc37e8)
*** 2002-11-22 03:29:42.243
kjxgmrcfg: Reconfiguration started, reason 1
kjxgmcs: Setting state to 0 0.
*** 2002-11-22 03:29:42.243
Name Service frozen
kjxgmcs: Setting state to 0 1.
kjfcpiora: publish my weight 152022
kjxgmps: proposing substate 2
kjxgmcs: Setting state to 6 2.
Performed the unique instance identification check
kjxgmps: proposing substate 3
kjxgmcs: Setting state to 6 3.
Name Service recovery started
Deleted all dead-instance name entries
kjxgmps: proposing substate 4
kjxgmcs: Setting state to 6 4.
Multicasted all local name entries for publish
Replayed all pending requests
kjxgmps: proposing substate 5
kjxgmcs: Setting state to 6 5.
Name Service normal
Name Service recovery done
*** 2002-11-22 03:29:43.397
kjxgmps: proposing substate 6
kjxgmcs: Setting state to 6 6.
*** 2002-11-22 03:29:43.507
*** 2002-11-22 03:29:43.508
Reconfiguration started
Synchronization timeout interval: 660 sec
List of nodes: 0,1,
Global Resource Directory frozen
node 0 release 9 2 0 2
node 1 release 9 2 0 2
res_master_weight for node 0 is 152022
res_master_weight for node 1 is 152022
Total master weight = 304044
Dead inst
Join inst 0 1
Exist inst
Active Sendback Threshold = 50 %
Communication channels reestablished
Master broadcasted resource hash value bitmaps
Non-local Process blocks cleaned out
Resources and enqueues cleaned out
Resources remastered 0
0 GCS shadows traversed, 0 cancelled, 0 closed
0 GCS resources traversed, 0 cancelled
set master node info
Submitted all remote-enqueue requests
kjfcrfg: Number of mesgs sent to node 0 = 0
Update rdomain variables
Dwn-cvts replayed, VALBLKs dubious
All grantable enqueues granted
*** 2002-11-22 03:29:43.868
0 GCS shadows traversed, 0 replayed, 0 unopened
Submitted all GCS cache requests
0 write requests issued in 887 GCS resources
0 PIs marked suspect, 0 flush PI msgs
*** 2002-11-22 03:29:44.116
Reconfiguration complete
*** 2002-11-22 03:29:51.261
kjxgrtmc2: Member 1 thread 2 mounted
*** 2002-12-21 04:02:05.645
kjxgrgetresults: Detect reconfig from 0, seq 6, reason 2
*** 2002-12-21 04:01:57.014
kjxgrrcfgchk: Initiating reconfig, reason 2
*** 2002-12-21 04:01:57.014
kjxgmrcfg: Reconfiguration started, reason 2
kjxgmcs: Setting state to 6 0.
*** 2002-12-21 04:01:57.021
Name Service frozen
kjxgmcs: Setting state to 6 1.
*** 2002-12-21 04:04:29.911
kjxgrdtrt: Evicted by 0, seq (7, 6)
error 29740 detected in background process
ORA-29740: evicted by member 0, group incarnation 7
ksuitm: waiting for [5] seconds before killing DIAG
--------------------------------------------------------------------------------

--
Author: chao_ping
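To put a number on the final sequence in the LMON trace, the `***` timestamp markers can be read with a short script. The following is only a minimal sketch: the names `events`, `first_time`, and `SAMPLE` are illustrative and not part of any Oracle tool, and `SAMPLE` is copied from the last lines of the trace quoted in this post. It measures the interval between "Reconfiguration started, reason 2" and the eviction message, and makes no claim about what caused the eviction.

```python
import re
from datetime import datetime

# Marker lines in the trace look like "*** 2002-12-21 04:01:57.014".
# Lines such as "*** SESSION ID:(3.1) ..." do not match and are treated as text.
STAMP = re.compile(r"\*\*\* (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def events(trace_text):
    """Pair each message line with the most recent '***' timestamp above it."""
    current = None
    out = []
    for raw in trace_text.splitlines():
        line = raw.strip()
        m = STAMP.match(line)
        if m:
            current = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
            continue
        # Lines that appear before the first timestamp are skipped.
        if line and current is not None:
            out.append((current, line))
    return out


def first_time(evts, needle):
    """Return the timestamp of the first message containing needle, or None."""
    for ts, msg in evts:
        if needle in msg:
            return ts
    return None


# Excerpt taken from the end of rac2_lmon_1634.trc as posted above.
SAMPLE = """\
*** 2002-12-21 04:01:57.014
kjxgmrcfg: Reconfiguration started, reason 2
kjxgmcs: Setting state to 6 0.
*** 2002-12-21 04:01:57.021
Name Service frozen
kjxgmcs: Setting state to 6 1.
*** 2002-12-21 04:04:29.911
kjxgrdtrt: Evicted by 0, seq (7, 6)
"""

evts = events(SAMPLE)
started = first_time(evts, "Reconfiguration started, reason 2")
evicted = first_time(evts, "Evicted by")
gap_seconds = (evicted - started).total_seconds()
print(f"reconfiguration start to eviction: {gap_seconds:.3f} s")
```

On the excerpt above this prints `reconfiguration start to eviction: 152.897 s`, roughly two and a half minutes between the start of the reconfiguration and the ORA-29740 eviction. The same functions could be pointed at the full trace file text to list every timestamped step.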