On Thu, Nov 20, 2008 at 02:48, Wolf Noble <[EMAIL PROTECTED]> wrote:
> Hi Gang.
>
> I have a 4-node VM cluster (CentOS 5).
> Using the GUI, I paused one node (db02) in the cluster.
> All resources failed off of it.
>
> I then shut down heartbeat,
>
> chkconfig --level 2345 heartbeat off
>
> yum remove heartbeat-stonith heartbeat-pils heartbeat heartbeat-gui openais
>
> wget http://download.opensuse.org/repositories/server:/ha-clustering/CentOS_5/server:ha-clustering.repo \
>   -O /etc/yum.repos.d/server_ha-clustering.repo
>
> yum --disablerepo base --enablerepo server_ha-clustering install openais \
>   heartbeat-common heartbeat-resources heartbeat pacemaker pacemaker-pygui \
>   libopenais2 libpacemaker3
>
> Reboot
>
> I then attempted to start heartbeat, whereupon the box would endlessly reboot
> itself thanks to CIB dumping core (hence disabling heartbeat at boot).
>
> Any ideas how to rectify?
Use "crm respawn" in ha.cf to prevent the rebooting.
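
For anyone following along, here is a rough sketch of making that change from a shell. It assumes ha.cf lives at /etc/ha.d/ha.cf (the usual heartbeat location, but check your install) and that the file currently has a line such as "crm yes" or "crm on". The function name set_crm_respawn is made up for illustration. Back up the file first.

```shell
# set_crm_respawn FILE: make "crm respawn" the only crm directive in FILE.
# Hypothetical helper, not part of heartbeat.
set_crm_respawn() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Drop any existing crm directive (e.g. "crm yes"), keep every other line.
    # grep exits non-zero when nothing is left to print, which is harmless here.
    grep -Ev '^[[:space:]]*crm[[:space:]]' "$conf" > "$tmp" || true
    printf 'crm respawn\n' >> "$tmp"
    # Overwrite in place so the original file's owner and mode are kept.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Usage (path is an assumption, run as root):
#   cp /etc/ha.d/ha.cf /etc/ha.d/ha.cf.bak
#   set_crm_respawn /etc/ha.d/ha.cf
```

The same setting normally has to go into ha.cf on every node, since the file is expected to match across the cluster.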
Beyond that, I'd need a stack trace.
With gdb installed, run:
gdb /usr/lib/heartbeat/cib /var/lib/heartbeat/cores/hacluster/core.2674
(or, if core.2674 doesn't exist:
gdb /usr/lib/heartbeat/cib /var/lib/heartbeat/cores/hacluster/core)
Your logs show pid 2674 was the cib process, so gdb needs the cib binary.
Then at the gdb prompt, type: where
and reply with the output.
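
If it is easier to capture the trace to a file than to copy it from the gdb prompt, something like the sketch below should work. It assumes the crashed process is /usr/lib/heartbeat/cib (the log below reports pid 2674 as cib) and uses gdb's -batch and -ex options to run "where" non-interactively. The helper names and the output file name are made up for illustration.

```shell
# pick_core DIR PID: print DIR/core.PID if it exists, otherwise DIR/core.
pick_core() {
    if [ -e "$1/core.$2" ]; then
        printf '%s\n' "$1/core.$2"
    else
        printf '%s\n' "$1/core"
    fi
}

# save_backtrace BINARY CORE OUT: write gdb's "where" output for CORE to OUT.
save_backtrace() {
    # -batch exits once the commands have run; -ex runs one gdb command.
    gdb -batch -ex where "$1" "$2" > "$3" 2>&1
}

# Usage (pid and paths from this thread; the output file name is hypothetical):
#   core=$(pick_core /var/lib/heartbeat/cores/hacluster 2674)
#   save_backtrace /usr/lib/heartbeat/cib "$core" /tmp/cib-backtrace.txt
```

The resulting text file can then be pasted into a reply. Without the matching debuginfo packages installed, the trace may show few symbol names.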
> Nov 19 16:58:40 db02 cib: [2674]: info: cib_process_request: Operation
> complete: op cib_slave for section 'all' (origin=local/crmd/4): ok (rc=0)
> Nov 19 16:58:40 db02 mgmtd: [2681]: info: Started.
> Nov 19 16:58:42 db02 crmd: [2678]: info: update_dc: Set DC to db01 (2.0)
> Nov 19 16:58:44 db02 ccm: [2673]: info: client (pid=2674) removed from ccm
> Nov 19 16:58:44 db02 mgmtd: [2681]: CRIT: cib_native_dispatch: Lost
> connection to the CIB service [2674/callback].
> Nov 19 16:58:44 db02 mgmtd: [2681]: CRIT: cib_native_dispatch: Lost
> connection to the CIB service [2674/command].
> Nov 19 16:58:44 db02 heartbeat: [2659]: WARN: Managed /usr/lib/heartbeat/cib
> process 2674 killed by signal 11 [SIGSEGV - Segmentation violation].
> Nov 19 16:58:44 db02 heartbeat: [2659]: ERROR: Managed /usr/lib/heartbeat/cib
> process 2674 dumped core
> Nov 19 16:58:44 db02 heartbeat: [2659]: EMERG: Rebooting system. Reason:
> /usr/lib/heartbeat/cib
> Nov 19 16:58:44 db02 ccm: [2673]: info: client (pid=2678) removed from ccm
> Nov 19 16:58:46 db02 attrd: [2677]: ERROR: send_ipc_message: IPC Channel to
> 2674 is not connected
> Nov 19 16:58:46 db02 attrd: [2677]: info: main: Starting mainloop...
> Nov 19 16:58:46 db02 attrd: [2677]: info: cib_native_msgready: Lost
> connection to the CIB service [2674].
> Nov 19 16:58:46 db02 attrd: [2677]: CRIT: cib_native_dispatch: Lost
> connection to the CIB service [2674/callback].
> Nov 19 16:58:46 db02 attrd: [2677]: CRIT: cib_native_dispatch: Lost
> connection to the CIB service [2674/command].
> Nov 19 16:58:46 db02 attrd: [2677]: ERROR: attrd_cib_connection_destroy:
> Connection to the CIB terminated...
> Nov 19 16:58:47 db02 kernel: md: stopping all md devices.
> Nov 19 16:59:36 db02 syslogd 1.4.1: restart.
>
>
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems