Hi, I'm having problems getting the CRM to start. If I run the cluster config in v1.x mode, it works OK; if I run it in v2 mode, I have issues. I was originally using unicast and couldn't get it to start at all. I have since moved to broadcast, and it will sort of start up, but I get lots of these:

Jul 21 13:56:14 ps0kpr last message repeated 69 times
Jul 21 13:56:14 ps0kpr crmd: [12502]: ERROR: cl_log: 35 messages were dropped
Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded
Jul 21 13:56:14 ps0kpr last message repeated 103 times
Jul 21 13:56:14 ps0kpr cib: [12498]: ERROR: cl_log: 237 messages were dropped
Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded
Jul 21 13:56:14 ps0kpr last message repeated 37 times
Jul 21 13:56:14 ps0kpr crmd: [12502]: ERROR: cl_log: 117 messages were dropped
Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded
Jul 21 13:56:14 ps0kpr last message repeated 79 times
Jul 21 13:56:14 ps0kpr heartbeat: [12488]: ERROR: cl_log: 14 messages were dropped
Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded

For a little while, crm_mon reports both hosts as OFFLINE with no DC (even though both are running heartbeat), but eventually it hangs. After some time there will be some logs indicating issues talking to a CRM client, which I believe are related to these.

[EMAIL PROTECTED] crm]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode

============
Last updated: Mon Jul 21 13:54:38 2008
Current DC: NONE
2 Nodes configured.
0 Resources configured.
============

Node: ps1kpr (6e9462ba-7465-411c-bcb4-10baf68dffc3): OFFLINE
Node: ps0kpr (9cf680e5-a2db-4d3d-9c6f-1ca4da51eb9d): OFFLINE

Here's my ha.cf:

keepalive 2
deadtime 16
warntime 10
initdead 60
udpport 694
bcast eth0 # Linux
auto_failback on
node ps0kpr ps1kpr
debug 9
use_logd yes
crm yes

And the logs:

Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: Enabling logging daemon
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: Version 2 support: yes
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: WARN: File /etc/ha.d/haresources exists.
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: WARN: This file is not used because crm is enabled
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: hacluster /usr/lib/heartbeat/ccm
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: hacluster /usr/lib/heartbeat/cib
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: root /usr/lib/heartbeat/lrmd -r
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: root /usr/lib/heartbeat/stonithd
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: hacluster /usr/lib/heartbeat/attrd
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: hacluster /usr/lib/heartbeat/crmd
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: root /usr/lib/heartbeat/mgmtd -v
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: AUTH: i=1: key = 0x8e6b750, auth=0x195228, authname=sha1
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: **************************
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: Configuration validated. Starting heartbeat 2.1.3
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: heartbeat: version 2.1.3
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: Heartbeat generation: 19
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: G_main_add_TriggerHandler: Added signal manual handler
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: G_main_add_TriggerHandler: Added signal manual handler
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jul 21 13:53:23 ps0kpr heartbeat: [12488]: info: Local status now set to: 'up'
Jul 21 13:53:23 ps0kpr heartbeat: [12488]: info: Managed write_hostcachedata process 12494 exited with return code 0.
Jul 21 13:53:23 ps0kpr heartbeat: [12488]: info: Link ps0kpr:eth0 up.
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Link ps1kpr:eth0 up.
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Status update for node ps1kpr: status up
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Comm_now_up(): updating status to active
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:33 ps0kpr last message repeated 16 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Local status now set to: 'active'
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client "/usr/lib/heartbeat/ccm" (1001,104)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:33 ps0kpr last message repeated 52 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client "/usr/lib/heartbeat/cib" (1001,104)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:33 ps0kpr last message repeated 52 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client "/usr/lib/heartbeat/lrmd -r" (0,0)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:33 ps0kpr last message repeated 110 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client "/usr/lib/heartbeat/stonithd" (0,0)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:33 ps0kpr last message repeated 11 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client "/usr/lib/heartbeat/attrd" (1001,104)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr last message repeated 59 times
Jul 21 13:53:34 ps0kpr heartbeat: [12488]: info: Starting child client "/usr/lib/heartbeat/crmd" (1001,104)
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr last message repeated 62 times
Jul 21 13:53:34 ps0kpr heartbeat: [12488]: info: Starting child client "/usr/lib/heartbeat/mgmtd -v" (0,0)
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr last message repeated 36 times
Jul 21 13:53:34 ps0kpr heartbeat: [12501]: info: Starting "/usr/lib/heartbeat/attrd" as uid 1001 gid 104 (pid 12501)
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr last message repeated 39 times
Jul 21 13:53:34 ps0kpr heartbeat: [12488]: info: Status update for node ps1kpr: status active
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr last message repeated 56 times
Jul 21 13:53:34 ps0kpr stonithd: [12500]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr stonithd: [12500]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr last message repeated 32 times
Jul 21 13:53:34 ps0kpr stonithd: [12500]: info: Signing in with heartbeat.
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr lrmd: [12499]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr mgmtd: [12503]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr last message repeated 2 times
Jul 21 13:53:34 ps0kpr attrd: [12501]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr crmd: [12502]: info: main: CRM Hg Version: node: 552305612591183b1628baa5bc6e903e0f1e26a3
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr last message repeated 4 times
Jul 21 13:53:34 ps0kpr lrmd: [12499]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr ccm: [12497]: info: Hostname: ps0kpr
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr cib: [12498]: info: G_main_add_TriggerHandler: Added signal manual handler
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr mgmtd: [12503]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr last message repeated 6 times
Jul 21 13:53:35 ps0kpr cib: [12498]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr mgmtd: [12503]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr last message repeated 3 times
Jul 21 13:53:35 ps0kpr lrmd: [12499]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr last message repeated 2 times
Jul 21 13:53:35 ps0kpr cib: [12498]: info: main: Retrieval of a per-action CIB: disabled
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr last message repeated 4 times
Jul 21 13:53:35 ps0kpr lrmd: [12499]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded
Jul 21 13:53:35 ps0kpr last message repeated 2 times
Jul 21 13:53:35 ps0kpr cib: [12498]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)

Any ideas?

Cheers
Simon
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
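
To put a number on the "send queue maximum length(500) exceeded" flood in the logs above, a rough Python sketch like the following could tally the warning per daemon and PID, folding in syslog's "last message repeated N times" lines. The function name `tally` and both regular expressions are my own assumptions based only on the log layout shown in this post; they are not part of heartbeat or any of its tools, and other syslog formats would likely need different patterns.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpt in this post:
#   "Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded"
ENTRY = re.compile(
    r"^\w{3}\s+\d+ \d{2}:\d{2}:\d{2} \S+ (?P<daemon>[\w-]+): \[(?P<pid>\d+)\]: "
    r"(?P<level>\w+): (?P<msg>.*)$"
)
# syslog's compression line: "... last message repeated 16 times"
REPEAT = re.compile(r"last message repeated (?P<n>\d+) times\s*$")


def tally(lines, needle="send queue maximum length"):
    """Count messages containing `needle` per (daemon, pid), including repeats."""
    counts = Counter()
    previous = None  # (daemon, pid) if the preceding entry matched, else None
    for line in lines:
        line = line.strip()
        repeat = REPEAT.search(line)
        if repeat:
            # A repeat line only counts if the entry right before it matched.
            if previous is not None:
                counts[previous] += int(repeat.group("n"))
            continue
        entry = ENTRY.match(line)
        if entry is not None and needle in entry.group("msg"):
            previous = (entry.group("daemon"), entry.group("pid"))
            counts[previous] += 1
        else:
            previous = None
    return counts


if __name__ == "__main__":
    sample = [
        "Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded",
        "Jul 21 13:53:33 ps0kpr last message repeated 16 times",
        "Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Local status now set to: 'active'",
        "Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) exceeded",
    ]
    for (daemon, pid), n in tally(sample).most_common():
        print(f"{daemon}[{pid}]: {n}")
```

Run over the full log file (for example `tally(open(path))` with a path of your choosing), this would also show whether the warning comes from one cib PID or several, since both 12498 and 12528 appear in the excerpts above.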