Hello,

Sometimes "heartbeat stop" seems to hang (latest packages from clusterlabs.org, RHEL5 x86_64, 2-node cluster with only one node running).

The last lines from ha-debug are like this:

Feb 22 12:52:48 dbprod21 ccm: [24053]: info: client (pid=24058) removed from ccm
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: do_ha_control: Disconnected from Heartbeat
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: do_cib_control: Disconnecting CIB
Feb 22 12:52:48 dbprod21 cib: [24054]: info: cib_process_readwrite: We are now in R/O mode
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: crmd_cib_connection_destroy: Connection to the CIB terminated...
Feb 22 12:52:48 dbprod21 cib: [24054]: WARN: send_ipc_message: IPC Channel to 24058 is not connected
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: do_exit: Performing A_EXIT_0 - gracefully exiting the CRMd
Feb 22 12:52:48 dbprod21 cib: [24054]: WARN: send_via_callback_channel: Delivery of reply to client 24058/d9c9c281-4f38-46d8-b83e-54135f6c75e9 failed
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: free_mem: Dropping I_TERMINATE: [ state=S_STOPPING cause=C_FSA_INTERNAL origin=do_stop ]
Feb 22 12:52:48 dbprod21 cib: [24054]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed
Feb 22 12:52:48 dbprod21 crmd: [24058]: info: do_exit: [crmd] stopped (0)
Feb 22 12:52:48 dbprod21 heartbeat: [24040]: info: killing /usr/lib64/heartbeat/attrd process group 24057 with signal 15

# ps -efw | grep heart

root     24040     1  0 12:49 ?        00:00:00 heartbeat: master control process
root     24043 24040  0 12:49 ?        00:00:00 heartbeat: FIFO reader
root     24044 24040  0 12:49 ?        00:00:00 heartbeat: write: ucast eth0
root     24045 24040  0 12:49 ?        00:00:00 heartbeat: read: ucast eth0
root     24046 24040  0 12:49 ?        00:00:00 heartbeat: write: ucast eth0
root     24047 24040  0 12:49 ?        00:00:00 heartbeat: read: ucast eth0
root     24048 24040  0 12:49 ?        00:00:00 heartbeat: write: serial /dev/ttyS0
root     24049 24040  0 12:49 ?        00:00:00 heartbeat: read: serial /dev/ttyS0
101      24053 24040  0 12:50 ?        00:00:00 /usr/lib64/heartbeat/ccm
101      24054 24040  0 12:50 ?        00:00:00 /usr/lib64/heartbeat/cib
root     24055 24040  0 12:50 ?        00:00:00 /usr/lib64/heartbeat/lrmd -r
root     24056 24040  0 12:50 ?        00:00:00 /usr/lib64/heartbeat/stonithd
101      24057 24040  0 12:50 ?        00:00:00 /usr/lib64/heartbeat/attrd
root     24366 22245  0 12:52 pts/2    00:00:00 /bin/sh /etc/init.d/heartbeat stop
root     24377 24366  0 12:52 pts/2    00:00:00 heartbeat

What could be the problem leading to this behaviour? Of course it's possible to kill the processes manually, but that's not what I'd really like to do...

Regards
Markus

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
