On 11/19/08 11:47 PM, "Andrew Beekhof" <[EMAIL PROTECTED]> wrote:
On Thu, Nov 20, 2008 at 02:48, Wolf Noble <[EMAIL PROTECTED]> wrote:
> Hi Gang.
>
> I have a 4-node VM cluster (CentOS 5).
> Using the GUI, I paused one node (db02) in the cluster.
> All resources failed over off it.
>
> I then shut down heartbeat and ran:
>
> chkconfig --level 2345 heartbeat off
>
> yum remove heartbeat-stonith heartbeat-pils heartbeat heartbeat-gui openais
>
> wget http://download.opensuse.org/repositories/server:/ha-clustering/CentOS_5/server:ha-clustering.repo \
>     -O /etc/yum.repos.d/server_ha-clustering.repo
>
> yum --disablerepo base --enablerepo server_ha-clustering install openais \
>     heartbeat-common heartbeat-resources heartbeat pacemaker pacemaker-pygui \
>     libopenais2 libpacemaker3
>
> Reboot
>
> I then attempted to start heartbeat, whereupon the box would endlessly reboot
> itself because the CIB dumps core (hence disabling heartbeat at boot).
>
> Any ideas how to rectify this?
Use "crm respawn" in ha.cf to prevent the rebooting.
Beyond that, I'd need a stack trace.
With gdb installed, run:
  gdb /usr/lib/heartbeat/crmd /var/lib/heartbeat/cores/hacluster/core.2674
(or, if core.2674 doesn't exist:
  gdb /usr/lib/heartbeat/crmd /var/lib/heartbeat/cores/hacluster/core)
Then, at the gdb prompt, type: where
and reply with the output.
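
For reference, a minimal sketch of checking which crm mode a node's ha.cf currently sets before switching it to respawn. The function name show_crm_mode and the default path /etc/ha.d/ha.cf are assumptions for illustration and do not come from this thread. As far as I know, with "crm respawn" heartbeat restarts a failed child such as cib instead of rebooting the node, but check the ha.cf documentation for your version.

```shell
#!/bin/sh
# Hypothetical helper: report the value of the "crm" directive in a ha.cf-style file.
# The default path is an assumption; pass another path as the first argument.
show_crm_mode() {
    conf="${1:-/etc/ha.d/ha.cf}"
    # Commented-out lines start with "#", so their first field never equals "crm".
    # If the directive appears more than once, the last value is reported.
    awk '$1 == "crm" { mode = $2 } END { if (mode != "") print mode }' "$conf"
}
```

If this prints "yes" or "on", changing that line to "crm respawn" and restarting heartbeat should be enough to try the suggestion.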
I got similar results regardless of whether I used only the normal
heartbeat/pacemaker packages or also had heartbeat-debug and pacemaker-debug
installed.
Here's the info with both debug packages installed (cib.xml is the same as
before, with the same cib.xml.sig):
Thanks!
Nov 20 12:11:18 db02 heartbeat: [8304]: info: Version 2 support: yes
Nov 20 12:11:18 db02 heartbeat: [8304]: WARN: Logging daemon is disabled
--enabling logging daemon is recommended
Nov 20 12:11:18 db02 heartbeat: [8304]: info: **************************
Nov 20 12:11:18 db02 heartbeat: [8304]: info: Configuration validated. Starting
heartbeat 2.99.2
Nov 20 12:11:18 db02 heartbeat: [8305]: info: heartbeat: version 2.99.2
Nov 20 12:11:19 db02 heartbeat: [8305]: info: Heartbeat generation: 1225155169
Nov 20 12:11:19 db02 heartbeat: [8305]: info: glib: UDP Broadcast heartbeat
started on port 694 (694) interface eth1
Nov 20 12:11:19 db02 heartbeat: [8305]: info: glib: UDP Broadcast heartbeat
closed on port 694 interface eth1 - Status: 1
Nov 20 12:11:19 db02 heartbeat: [8305]: info: glib: UDP Broadcast heartbeat
started on port 694 (694) interface eth0
Nov 20 12:11:19 db02 heartbeat: [8305]: info: glib: UDP Broadcast heartbeat
closed on port 694 interface eth0 - Status: 1
Nov 20 12:11:19 db02 heartbeat: [8305]: info: glib: UDP multicast heartbeat
started for group 225.0.0.1 port 694 interface eth0 (ttl=1 loop=0)
Nov 20 12:11:19 db02 heartbeat: [8305]: info: G_main_add_TriggerHandler: Added
signal manual handler
Nov 20 12:11:19 db02 heartbeat: [8305]: info: G_main_add_TriggerHandler: Added
signal manual handler
Nov 20 12:11:19 db02 heartbeat: [8305]: info: G_main_add_SignalHandler: Added
signal handler for signal 17
Nov 20 12:11:19 db02 heartbeat: [8305]: info: Local status now set to: 'up'
Nov 20 12:11:20 db02 heartbeat: [8305]: info: Link web02:eth0 up.
Nov 20 12:11:20 db02 heartbeat: [8305]: info: Status update for node web02:
status active
Nov 20 12:11:20 db02 heartbeat: [8305]: info: Link web01:eth0 up.
Nov 20 12:11:20 db02 heartbeat: [8305]: info: Link db01:eth0 up.
Nov 20 12:11:20 db02 heartbeat: [8305]: info: Status update for node web01:
status active
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Comm_now_up(): updating status to
active
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Local status now set to: 'active'
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Starting child client
"/usr/lib/heartbeat/mgmtd -v" (0,0)
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Starting child client
"/usr/lib/heartbeat/ccm" (498,496)
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Starting child client
"/usr/lib/heartbeat/cib" (498,496)
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Starting child client
"/usr/lib/heartbeat/lrmd -r" (0,0)
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Starting child client
"/usr/lib/heartbeat/stonithd" (0,0)
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Starting child client
"/usr/lib/heartbeat/attrd" (498,496)
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Starting child client
"/usr/lib/heartbeat/crmd" (498,496)
Nov 20 12:11:21 db02 heartbeat: [8305]: info: Status update for node db01:
status active
Nov 20 12:11:21 db02 heartbeat: [8319]: info: Starting "/usr/lib/heartbeat/ccm"
as uid 498 gid 496 (pid 8319)
Nov 20 12:11:21 db02 heartbeat: [8320]: info: Starting "/usr/lib/heartbeat/cib"
as uid 498 gid 496 (pid 8320)
Nov 20 12:11:21 db02 heartbeat: [8321]: info: Starting "/usr/lib/heartbeat/lrmd
-r" as uid 0 gid 0 (pid 8321)
Nov 20 12:11:21 db02 heartbeat: [8323]: info: Starting
"/usr/lib/heartbeat/attrd" as uid 498 gid 496 (pid 8323)
Nov 20 12:11:21 db02 heartbeat: [8322]: info: Starting
"/usr/lib/heartbeat/stonithd" as uid 0 gid 0 (pid 8322)
Nov 20 12:11:21 db02 heartbeat: [8324]: info: Starting
"/usr/lib/heartbeat/crmd" as uid 498 gid 496 (pid 8324)
Nov 20 12:11:21 db02 heartbeat: [8318]: info: Starting
"/usr/lib/heartbeat/mgmtd -v" as uid 0 gid 0 (pid 8318)
Nov 20 12:11:21 db02 stonithd: [8322]: info: G_main_add_SignalHandler: Added
signal handler for signal 10
Nov 20 12:11:21 db02 stonithd: [8322]: info: G_main_add_SignalHandler: Added
signal handler for signal 12
Nov 20 12:11:21 db02 lrmd: [8321]: info: G_main_add_SignalHandler: Added signal
handler for signal 15
Nov 20 12:11:21 db02 crmd: [8324]: info: main: CRM Hg Version: node:
6fc5ce8302abf145a02891ec41e5a492efbe8efe
Nov 20 12:11:21 db02 cib: [8320]: info: G_main_add_SignalHandler: Added signal
handler for signal 15
Nov 20 12:11:21 db02 cib: [8320]: info: G_main_add_TriggerHandler: Added signal
manual handler
Nov 20 12:11:21 db02 cib: [8320]: info: G_main_add_SignalHandler: Added signal
handler for signal 17
Nov 20 12:11:21 db02 cib: [8320]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
/var/lib/heartbeat/crm/cib.xml.sig)
Nov 20 12:11:21 db02 cib: [8320]: ERROR: validate_cib_digest: Digest
comparision failed: expected 29fbc0aa4a0f21ba83a83739b1591f52
(/var/lib/heartbeat/crm/cib.xml.sig), calculated
e82a0fd028b6a545881f66ce5b6c1cd4
Nov 20 12:11:21 db02 cib: [8320]: ERROR: retrieveCib: Checksum of
/var/lib/heartbeat/crm/cib.xml failed! Configuration contents ignored!
Nov 20 12:11:21 db02 cib: [8320]: ERROR: retrieveCib: Usually this is caused by
manual changes, please refer to http://linux-ha.org/v2/faq/cib_changes_detected
Nov 20 12:11:21 db02 cib: [8320]: WARN: retrieveCib: Continuing but
/var/lib/heartbeat/crm/cib.xml will NOT used.
Nov 20 12:11:21 db02 cib: [8320]: ERROR: retrieveCib: Archiving corrupt or
unusable configuration to /var/lib/heartbeat/crm/cib.xml.8320
Nov 20 12:11:21 db02 cib: [8320]: info: archive_file:
/var/lib/heartbeat/crm/cib.xml archived as /var/lib/heartbeat/crm/cib.xml.8320
Nov 20 12:11:21 db02 ccm: [8319]: info: Hostname: db02
Nov 20 12:11:22 db02 lrmd: [8321]: info: G_main_add_SignalHandler: Added signal
handler for signal 17
Nov 20 12:11:21 db02 crmd: [8324]: info: crmd_init: Starting crmd
Nov 20 12:11:22 db02 crmd: [8324]: info: G_main_add_SignalHandler: Added signal
handler for signal 15
Nov 20 12:11:22 db02 crmd: [8324]: info: G_main_add_TriggerHandler: Added
signal manual handler
Nov 20 12:11:21 db02 attrd: [8323]: info: G_main_add_SignalHandler: Added
signal handler for signal 15
Nov 20 12:11:22 db02 attrd: [8323]: info: main: Starting up....
Nov 20 12:11:22 db02 stonithd: [8322]: info: register_heartbeat_conn: Hostname:
db02
Nov 20 12:11:22 db02 stonithd: [8322]: info: register_heartbeat_conn: UUID:
1ebdfff3-30a9-478f-b487-ca6e2122b27f
Nov 20 12:11:22 db02 mgmtd: [8318]: info: G_main_add_SignalHandler: Added
signal handler for signal 15
Nov 20 12:11:22 db02 mgmtd: [8318]: info: G_main_add_SignalHandler: Added
signal handler for signal 10
Nov 20 12:11:22 db02 mgmtd: [8318]: info: G_main_add_SignalHandler: Added
signal handler for signal 12
Nov 20 12:11:22 db02 stonithd: [8322]: notice: /usr/lib/heartbeat/stonithd
start up successfully.
Nov 20 12:11:22 db02 stonithd: [8322]: info: G_main_add_SignalHandler: Added
signal handler for signal 17
Nov 20 12:11:22 db02 crmd: [8324]: info: G_main_add_SignalHandler: Added signal
handler for signal 17
Nov 20 12:11:22 db02 cib: [8320]: info: archive_file:
/var/lib/heartbeat/crm/cib.xml.sig archived as
/var/lib/heartbeat/crm/cib.xml.sig.8320
Nov 20 12:11:22 db02 attrd: [8323]: info: register_heartbeat_conn: Hostname:
db02
Nov 20 12:11:22 db02 attrd: [8323]: info: register_heartbeat_conn: UUID:
1ebdfff3-30a9-478f-b487-ca6e2122b27f
Nov 20 12:11:22 db02 lrmd: [8321]: info: G_main_add_SignalHandler: Added signal
handler for signal 10
Nov 20 12:11:22 db02 lrmd: [8321]: info: G_main_add_SignalHandler: Added signal
handler for signal 12
Nov 20 12:11:22 db02 lrmd: [8321]: info: Started.
Nov 20 12:11:22 db02 mgmtd: [8318]: info: init_crm
Nov 20 12:11:22 db02 cib: [8320]: WARN: readCibXmlFile: Primary configuration
corrupt or unusable, trying backup...
Nov 20 12:11:22 db02 cib: [8320]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.xml.last (digest:
/var/lib/heartbeat/crm/cib.xml.sig.last)
Nov 20 12:11:22 db02 cib: [8320]: ERROR: validate_cib_digest: Digest
comparision failed: expected 29fbc0aa4a0f21ba83a83739b1591f52
(/var/lib/heartbeat/crm/cib.xml.sig.last), calculated
e82a0fd028b6a545881f66ce5b6c1cd4
Nov 20 12:11:22 db02 cib: [8320]: ERROR: retrieveCib: Checksum of
/var/lib/heartbeat/crm/cib.xml.last failed! Configuration contents ignored!
Nov 20 12:11:22 db02 cib: [8320]: ERROR: retrieveCib: Usually this is caused by
manual changes, please refer to http://linux-ha.org/v2/faq/cib_changes_detected
Nov 20 12:11:22 db02 cib: [8320]: WARN: retrieveCib: Continuing but
/var/lib/heartbeat/crm/cib.xml.last will NOT used.
Nov 20 12:11:22 db02 cib: [8320]: WARN: readCibXmlFile: Continuing with an
empty configuration.
Nov 20 12:11:22 db02 cib: [8320]: info: startCib: CIB Initialization completed
successfully
Nov 20 12:11:22 db02 mgmtd: [8318]: info: login to cib: 0, ret:-10
Nov 20 12:11:22 db02 cib: [8320]: info: register_heartbeat_conn: Hostname: db02
Nov 20 12:11:22 db02 cib: [8320]: info: register_heartbeat_conn: UUID:
1ebdfff3-30a9-478f-b487-ca6e2122b27f
Nov 20 12:11:22 db02 cib: [8320]: info: ccm_connect: Registering with CCM...
Nov 20 12:11:22 db02 cib: [8320]: WARN: ccm_connect: CCM Activation failed
Nov 20 12:11:22 db02 cib: [8320]: WARN: ccm_connect: CCM Connection failed 1
times (30 max)
Nov 20 12:11:23 db02 crmd: [8324]: WARN: do_cib_control: Couldn't complete CIB
registration 1 times... pause and retry
Nov 20 12:11:23 db02 crmd: [8324]: info: crmd_init: Starting crmd's mainloop
Nov 20 12:11:23 db02 mgmtd: [8318]: info: login to cib: 1, ret:-10
Nov 20 12:11:24 db02 mgmtd: [8318]: info: login to cib: 2, ret:-10
Nov 20 12:11:25 db02 crmd: [8324]: info: crm_timer_popped: Wait Timer (I_NULL)
just popped!
Nov 20 12:11:25 db02 mgmtd: [8318]: info: login to cib: 3, ret:-10
Nov 20 12:11:25 db02 cib: [8320]: info: ccm_connect: Registering with CCM...
Nov 20 12:11:25 db02 cib: [8320]: WARN: ccm_connect: CCM Activation failed
Nov 20 12:11:25 db02 cib: [8320]: WARN: ccm_connect: CCM Connection failed 2
times (30 max)
Nov 20 12:11:26 db02 crmd: [8324]: WARN: do_cib_control: Couldn't complete CIB
registration 2 times... pause and retry
Nov 20 12:11:26 db02 mgmtd: [8318]: info: login to cib: 4, ret:-10
Nov 20 12:11:27 db02 mgmtd: [8318]: info: login to cib failed
Nov 20 12:11:27 db02 mgmtd: [8318]: ERROR: Can't initialize management
library.Shutting down.(-1)
Nov 20 12:11:27 db02 heartbeat: [8305]: WARN: Managed /usr/lib/heartbeat/mgmtd
-v process 8318 exited with return code 1.
Nov 20 12:11:27 db02 heartbeat: [8305]: ERROR: Respawning client
"/usr/lib/heartbeat/mgmtd -v":
Nov 20 12:11:27 db02 heartbeat: [8305]: info: Starting child client
"/usr/lib/heartbeat/mgmtd -v" (0,0)
Nov 20 12:11:27 db02 ccm: [8319]: info: G_main_add_SignalHandler: Added signal
handler for signal 15
Nov 20 12:11:28 db02 crmd: [8324]: info: crm_timer_popped: Wait Timer (I_NULL)
just popped!
Nov 20 12:11:28 db02 heartbeat: [8327]: info: Starting
"/usr/lib/heartbeat/mgmtd -v" as uid 0 gid 0 (pid 8327)
Nov 20 12:11:28 db02 mgmtd: [8327]: info: G_main_add_SignalHandler: Added
signal handler for signal 15
Nov 20 12:11:28 db02 mgmtd: [8327]: info: G_main_add_SignalHandler: Added
signal handler for signal 10
Nov 20 12:11:28 db02 mgmtd: [8327]: info: G_main_add_SignalHandler: Added
signal handler for signal 12
Nov 20 12:11:28 db02 mgmtd: [8327]: info: init_crm
Nov 20 12:11:28 db02 mgmtd: [8327]: info: login to cib: 0, ret:-10
Nov 20 12:11:28 db02 cib: [8320]: info: ccm_connect: Registering with CCM...
Nov 20 12:11:28 db02 cib: [8320]: info: cib_init: Requesting the list of
configured nodes
Nov 20 12:11:28 db02 cib: [8320]: info: cib_init: Starting cib mainloop
Nov 20 12:11:28 db02 cib: [8320]: info: cib_client_status_callback: Status
update: Client db02/cib now has status [join]
Nov 20 12:11:28 db02 cib: [8320]: info: crm_update_peer: Node 0 is now known as
db02
Nov 20 12:11:28 db02 cib: [8320]: info: crm_update_peer: New Node db02: id=0
state=unknown addr=(null) votes=-1 born=0 seen=0
proc=00000000000000000000000000000000
Nov 20 12:11:28 db02 cib: [8320]: info: crm_update_peer_proc: db02.cib is now
online
Nov 20 12:11:28 db02 cib: [8328]: info: write_cib_contents: Wrote version 0.0.0
of the CIB to disk (digest: 2d74315af1a0a5039ce32402d6cc5e4c)
Nov 20 12:11:28 db02 cib: [8328]: info: retrieveCib: Reading cluster
configuration from: /var/lib/heartbeat/crm/cib.xml (digest:
/var/lib/heartbeat/crm/cib.xml.sig)
Nov 20 12:11:29 db02 cib: [8320]: info: cib_common_callback_worker: Setting
cib_refresh_notify callbacks for 8324 (3dd6add6-555d-466f-92b6-9595c5215d90): on
Nov 20 12:11:29 db02 crmd: [8324]: info: do_cib_control: CIB connection
established
Nov 20 12:11:29 db02 cib: [8320]: info: cib_client_status_callback: Status
update: Client db02/cib now has status [online]
Nov 20 12:11:29 db02 crmd: [8324]: info: register_heartbeat_conn: Hostname: db02
Nov 20 12:11:29 db02 crmd: [8324]: info: register_heartbeat_conn: UUID:
1ebdfff3-30a9-478f-b487-ca6e2122b27f
Nov 20 12:11:29 db02 cib: [8320]: info: cib_common_callback_worker: Setting
cib_diff_notify callbacks for 8327 (f862740e-d805-4a1e-a038-24f0bf264407): on
Nov 20 12:11:29 db02 crmd: [8324]: info: do_ha_control: Connected to Heartbeat
Nov 20 12:11:29 db02 crmd: [8324]: info: do_ccm_control: CCM connection
established... waiting for first callback
Nov 20 12:11:29 db02 crmd: [8324]: info: do_started: Delaying start, CCM
(0000000000100000) not connected
Nov 20 12:11:29 db02 crmd: [8324]: notice: crmd_client_status_callback: Status
update: Client db02/crmd now has status [online] (DC=false)
Nov 20 12:11:29 db02 cib: [8320]: info: cib_client_status_callback: Status
update: Client web01/cib now has status [online]
Nov 20 12:11:29 db02 crmd: [8324]: info: crm_update_peer: Node 0 is now known
as db02
Nov 20 12:11:29 db02 crmd: [8324]: info: crm_update_peer: New Node db02: id=0
state=unknown addr=(null) votes=-1 born=0 seen=0
proc=00000000000000000000000000000000
Nov 20 12:11:29 db02 crmd: [8324]: info: crm_update_peer_proc: db02.crmd is now
online
Nov 20 12:11:29 db02 crmd: [8324]: info: crmd_client_status_callback: Not the DC
Nov 20 12:11:29 db02 crmd: [8324]: notice: crmd_client_status_callback: Status
update: Client db02/crmd now has status [online] (DC=false)
Nov 20 12:11:30 db02 crmd: [8324]: info: crmd_client_status_callback: Not the DC
Nov 20 12:11:30 db02 crmd: [8324]: notice: crmd_client_status_callback: Status
update: Client web01/crmd now has status [online] (DC=false)
Nov 20 12:11:30 db02 cib: [8320]: info: crm_update_peer: Node 0 is now known as
web01
Nov 20 12:11:30 db02 cib: [8320]: info: crm_update_peer: New Node web01: id=0
state=unknown addr=(null) votes=-1 born=0 seen=0
proc=00000000000000000000000000000000
Nov 20 12:11:30 db02 cib: [8320]: info: crm_update_peer_proc: web01.cib is now
online
Nov 20 12:11:30 db02 cib: [8320]: info: cib_client_status_callback: Status
update: Client db01/cib now has status [online]
Nov 20 12:11:30 db02 cib: [8320]: info: crm_update_peer: Node 0 is now known as
db01
Nov 20 12:11:30 db02 cib: [8320]: info: crm_update_peer: New Node db01: id=0
state=unknown addr=(null) votes=-1 born=0 seen=0
proc=00000000000000000000000000000000
Nov 20 12:11:30 db02 cib: [8320]: info: crm_update_peer_proc: db01.cib is now
online
Nov 20 12:11:30 db02 cib: [8320]: info: cib_client_status_callback: Status
update: Client web02/cib now has status [online]
Nov 20 12:11:30 db02 crmd: [8324]: info: crm_update_peer: Node 0 is now known
as web01
Nov 20 12:11:30 db02 crmd: [8324]: info: crm_update_peer: New Node web01: id=0
state=unknown addr=(null) votes=-1 born=0 seen=0
proc=00000000000000000000000000000000
Nov 20 12:11:30 db02 crmd: [8324]: info: crm_update_peer_proc: web01.crmd is
now online
Nov 20 12:11:30 db02 crmd: [8324]: info: crmd_client_status_callback: Not the DC
Nov 20 12:11:30 db02 crmd: [8324]: notice: crmd_client_status_callback: Status
update: Client db01/crmd now has status [online] (DC=false)
Nov 20 12:11:31 db02 cib: [8320]: info: crm_update_peer: Node 0 is now known as
web02
Nov 20 12:11:31 db02 cib: [8320]: info: crm_update_peer: New Node web02: id=0
state=unknown addr=(null) votes=-1 born=0 seen=0
proc=00000000000000000000000000000000
Nov 20 12:11:31 db02 cib: [8320]: info: crm_update_peer_proc: web02.cib is now
online
Nov 20 12:11:31 db02 crmd: [8324]: info: crm_update_peer: Node 0 is now known
as db01
Nov 20 12:11:31 db02 crmd: [8324]: info: crm_update_peer: New Node db01: id=0
state=unknown addr=(null) votes=-1 born=0 seen=0
proc=00000000000000000000000000000000
Nov 20 12:11:31 db02 crmd: [8324]: info: crm_update_peer_proc: db01.crmd is now
online
Nov 20 12:11:31 db02 crmd: [8324]: info: crmd_client_status_callback: Not the DC
Nov 20 12:11:31 db02 crmd: [8324]: notice: crmd_client_status_callback: Status
update: Client web02/crmd now has status [online] (DC=false)
Nov 20 12:11:31 db02 crmd: [8324]: info: crm_update_peer: Node 0 is now known
as web02
Nov 20 12:11:31 db02 crmd: [8324]: info: crm_update_peer: New Node web02: id=0
state=unknown addr=(null) votes=-1 born=0 seen=0
proc=00000000000000000000000000000000
Nov 20 12:11:31 db02 crmd: [8324]: info: crm_update_peer_proc: web02.crmd is
now online
Nov 20 12:11:31 db02 crmd: [8324]: info: crmd_client_status_callback: Not the DC
Nov 20 12:11:31 db02 crmd: [8324]: info: do_started: Delaying start, CCM
(0000000000100000) not connected
Nov 20 12:11:32 db02 crmd: [8324]: info: mem_handle_event: Got an event
OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 20 12:11:32 db02 crmd: [8324]: info: mem_handle_event: instance=4, nodes=4,
new=4, lost=0, n_idx=0, new_idx=0, old_idx=8
Nov 20 12:11:32 db02 crmd: [8324]: info: crmd_ccm_msg_callback: Quorum
(re)attained after event=NEW MEMBERSHIP (id=4)
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: NEW MEMBERSHIP:
trans=4, nodes=4, new=4, lost=0 n_idx=0, new_idx=0, old_idx=8
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: CURRENT: db01
[nodeid=0, born=1]
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: CURRENT: web02
[nodeid=3, born=2]
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: CURRENT: web01
[nodeid=2, born=3]
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: CURRENT: db02
[nodeid=1, born=4]
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: NEW: db01
[nodeid=0, born=1]
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: NEW: web02
[nodeid=3, born=2]
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: NEW: web01
[nodeid=2, born=3]
Nov 20 12:11:32 db02 crmd: [8324]: info: ccm_event_detail: NEW: db02
[nodeid=1, born=4]
Nov 20 12:11:32 db02 cib: [8320]: info: mem_handle_event: Got an event
OC_EV_MS_NEW_MEMBERSHIP from ccm
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer: Node db01: id=0
state=member (new) addr=(null) votes=-1 born=1 seen=4
proc=00000000000000000000000000000200
Nov 20 12:11:32 db02 cib: [8320]: info: mem_handle_event: instance=4, nodes=4,
new=4, lost=0, n_idx=0, new_idx=0, old_idx=8
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer_proc: db01.ais is now
online
Nov 20 12:11:32 db02 cib: [8320]: info: cib_ccm_msg_callback: Processing CCM
event=NEW MEMBERSHIP (id=4)
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer: Node web02 now has
id: 3
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer: Node db01: id=0
state=member (new) addr=(null) votes=-1 born=1 seen=4
proc=00000000000000000000000000000100
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer: Node web02: id=3
(new) state=member (new) addr=(null) votes=-1 born=2 seen=4
proc=00000000000000000000000000000200
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer_proc: db01.ais is now
online
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer_proc: web02.ais is now
online
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer: Node web02 now has id:
3
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer: Node web01 now has
id: 2
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer: Node web02: id=3 (new)
state=member (new) addr=(null) votes=-1 born=2 seen=4
proc=00000000000000000000000000000100
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer: Node web01: id=2
(new) state=member (new) addr=(null) votes=-1 born=3 seen=4
proc=00000000000000000000000000000200
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer_proc: web02.ais is now
online
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer_proc: web01.ais is now
online
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer: Node web01 now has id:
2
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer: Node db02 now has id:
1
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer: Node web01: id=2 (new)
state=member (new) addr=(null) votes=-1 born=3 seen=4
proc=00000000000000000000000000000100
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer: Node db02: id=1 (new)
state=member (new) addr=(null) votes=-1 born=4 seen=4
proc=00000000000000000000000000000200
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer_proc: web01.ais is now
online
Nov 20 12:11:32 db02 crmd: [8324]: info: crm_update_peer_proc: db02.ais is now
online
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer: Node db02 now has id: 1
Nov 20 12:11:32 db02 crmd: [8324]: info: do_started: The local CRM is
operational
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer: Node db02: id=1 (new)
state=member (new) addr=(null) votes=-1 born=4 seen=4
proc=00000000000000000000000000000100
Nov 20 12:11:32 db02 crmd: [8324]: info: do_state_transition: State transition
S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL
origin=do_started ]
Nov 20 12:11:32 db02 cib: [8320]: info: crm_update_peer_proc: db02.ais is now
online
Nov 20 12:11:32 db02 cib: [8320]: info: cib_process_request: Operation
complete: op cib_slave for section 'all' (origin=local/crmd/4): ok (rc=0)
Nov 20 12:11:33 db02 mgmtd: [8327]: info: Started.
Nov 20 12:11:34 db02 crmd: [8324]: info: update_dc: Set DC to db01 (2.0)
Nov 20 12:11:36 db02 crmd: [8324]: info: update_dc: Set DC to db01 (2.0)
Nov 20 12:11:36 db02 crmd: [8324]: info: do_state_transition: State transition
S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE
origin=do_cl_join_finalize_respond ]
Nov 20 12:11:36 db02 ccm: [8319]: info: client (pid=8320) removed from ccm
Nov 20 12:11:36 db02 crmd: [8324]: info: cib_native_msgready: Lost connection
to the CIB service [8320].
Nov 20 12:11:36 db02 crmd: [8324]: CRIT: cib_native_dispatch: Lost connection
to the CIB service [8320/callback].
Nov 20 12:11:36 db02 crmd: [8324]: CRIT: cib_native_dispatch: Lost connection
to the CIB service [8320/command].
Nov 20 12:11:36 db02 crmd: [8324]: ERROR: crmd_cib_connection_destroy:
Connection to the CIB terminated...
Nov 20 12:11:36 db02 crmd: [8324]: ERROR: do_log: FSA: Input I_ERROR from
crmd_cib_connection_destroy() received in state S_NOT_DC
Nov 20 12:11:36 db02 crmd: [8324]: info: do_state_transition: State transition
S_NOT_DC -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL
origin=crmd_cib_connection_destroy ]
Nov 20 12:11:36 db02 crmd: [8324]: ERROR: do_recover: Action A_RECOVER
(0000000001000000) not supported
Nov 20 12:11:36 db02 crmd: [8324]: ERROR: do_log: FSA: Input I_TERMINATE from
do_recover() received in state S_RECOVERY
Nov 20 12:11:36 db02 crmd: [8324]: info: do_state_transition: State transition
S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE cause=C_FSA_INTERNAL
origin=do_recover ]
Nov 20 12:11:36 db02 crmd: [8324]: info: do_shutdown: All subsystems stopped,
continuing
Nov 20 12:11:36 db02 crmd: [8324]: info: do_lrm_control: Disconnected from the
LRM
Nov 20 12:11:36 db02 crmd: [8324]: info: do_ha_control: Disconnected from
Heartbeat
Nov 20 12:11:36 db02 ccm: [8319]: info: client (pid=8324) removed from ccm
Nov 20 12:11:36 db02 mgmtd: [8327]: CRIT: cib_native_dispatch: Lost connection
to the CIB service [8320/callback].
Nov 20 12:11:36 db02 heartbeat: [8305]: WARN: Managed /usr/lib/heartbeat/cib
process 8320 killed by signal 11 [SIGSEGV - Segmentation violation].
Nov 20 12:11:36 db02 crmd: [8324]: info: do_cib_control: Disconnecting CIB
Nov 20 12:11:36 db02 mgmtd: [8327]: CRIT: cib_native_dispatch: Lost connection
to the CIB service [8320/command].
Nov 20 12:11:36 db02 heartbeat: [8305]: ERROR: Managed /usr/lib/heartbeat/cib
process 8320 dumped core
Nov 20 12:11:36 db02 crmd: [8324]: info: do_exit: Performing A_EXIT_0 -
gracefully exiting the CRMd
Nov 20 12:11:36 db02 heartbeat: [8305]: EMERG: Rebooting system. Reason:
/usr/lib/heartbeat/cib
Nov 20 12:11:36 db02 crmd: [8324]: ERROR: do_exit: Could not recover from
internal error
Nov 20 12:11:36 db02 crmd: [8324]: info: free_mem: Dropping I_TERMINATE: [
state=S_TERMINATE cause=C_FSA_INTERNAL origin=do_stop ]
Nov 20 12:11:36 db02 crmd: [8324]: info: do_exit: [crmd] stopped (2)
Nov 20 12:11:37 db02 attrd: [8323]: ERROR: send_ipc_message: IPC Channel to
8320 is not connected
Nov 20 12:11:37 db02 attrd: [8323]: info: main: Starting mainloop...
Nov 20 12:11:37 db02 attrd: [8323]: info: cib_native_msgready: Lost connection
to the CIB service [8320].
Nov 20 12:11:37 db02 attrd: [8323]: CRIT: cib_native_dispatch: Lost connection
to the CIB service [8320/callback].
Nov 20 12:11:37 db02 attrd: [8323]: CRIT: cib_native_dispatch: Lost connection
to the CIB service [8320/command].
Nov 20 12:11:37 db02 kernel: md: stopping all md devices.
Nov 20 12:11:37 db02 attrd: [8323]: ERROR: attrd_cib_connection_destroy:
Connection to the CIB terminated...
Nov 20 12:12:02 db02 syslogd 1.4.1: restart.
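
The log above is long, so here is a rough sketch of one way to pull out only the high-severity entries (the digest failures, the lost CIB connections, the SIGSEGV core dump, and the reboot). The function name severe_lines is made up for illustration, and the pattern assumes the "[pid]: LEVEL:" layout seen in this log.

```shell
#!/bin/sh
# Hypothetical helper: keep only ERROR, CRIT and EMERG entries from a
# heartbeat-style syslog extract. Reads the named files, or stdin if none are given.
severe_lines() {
    grep -E '\]: (ERROR|CRIT|EMERG): ' "$@"
}
```

For example, "severe_lines /var/log/messages" (the log path is an assumption) would skip the info and WARN chatter. Lines wrapped by the mailer, as in this post, only match on their first half.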
[EMAIL PROTECTED] log]# gdb /usr/lib/heartbeat/crmd
/var/lib/heartbeat/cores/hacluster/core.8320
GNU gdb Red Hat Linux (6.5-37.el5_2.2rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db
library "/lib/libthread_db.so.1".
warning: core file may not match specified executable file.
Core was generated by `/usr/lib/heartbeat/cib'.
Program terminated with signal 11, Segmentation fault.
#0 0x00c18af2 in ?? ()
(gdb) where
#0 0x00c18af2 in ?? ()
#1 0x08487dd8 in ?? ()
#2 0x08482c60 in ?? ()
#3 0x57b43e00 in ?? ()
#4 0x00d77fbc in ?? ()
#5 0x08487dd8 in ?? ()
#6 0x00000000 in ?? ()
(gdb)
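
Note that gdb warns above that the core may not match the executable: it was opened against /usr/lib/heartbeat/crmd, while the core was generated by /usr/lib/heartbeat/cib, which is probably why every frame shows as "??". A small sketch for extracting the right binary from gdb's own output follows; the function name core_generator is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read gdb output on stdin and print the executable named
# in its "Core was generated by" line, so the core can be reopened against it.
core_generator() {
    sed -n "s/^Core was generated by \`\([^' ]*\).*/\1/p"
}

# Example (not run here), using the paths that appear in this thread:
#   gdb /usr/lib/heartbeat/cib /var/lib/heartbeat/cores/hacluster/core.8320
# then type "where" at the (gdb) prompt as before.
```

Whether the frames then resolve to function names still depends on matching debug symbols being installed for the cib binary.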