Hi,

I'm having problems getting the CRM to start. If I run the cluster config in
v1.x mode, it works OK. If I run it in v2 mode (crm yes), I have issues. I was
originally using unicast and couldn't get it to start at all. I have since
moved to broadcast, and it partially starts up, but I get lots of these:

Jul 21 13:56:14 ps0kpr last message repeated 69 times
Jul 21 13:56:14 ps0kpr crmd: [12502]: ERROR: cl_log: 35 messages were dropped
Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded
Jul 21 13:56:14 ps0kpr last message repeated 103 times
Jul 21 13:56:14 ps0kpr cib: [12498]: ERROR: cl_log: 237 messages were dropped
Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded
Jul 21 13:56:14 ps0kpr last message repeated 37 times
Jul 21 13:56:14 ps0kpr crmd: [12502]: ERROR: cl_log: 117 messages were dropped
Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded
Jul 21 13:56:14 ps0kpr last message repeated 79 times
Jul 21 13:56:14 ps0kpr heartbeat: [12488]: ERROR: cl_log: 14 messages were dropped
Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded
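
To get a feel for how many of these warnings each process produces, the log can be tallied with a short script. This is only a rough sketch: the names `tally_queue_warnings`, `LINE_RE` and `REPEAT_RE` are made up for illustration, and it assumes each log record sits on a single line (not wrapped by the mail client) and that a syslog "last message repeated N times" line refers to the record directly before it.

```python
import re
from collections import Counter

# Matches syslog records such as:
#   Jul 21 13:56:14 ps0kpr cib: [12528]: WARN: send queue maximum length(500) exceeded
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+(\w+): \[(\d+)\]: (\w+): (.*)$"
)
# Matches syslog's repeat-compression lines.
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def tally_queue_warnings(lines):
    """Count 'send queue maximum length' warnings per (daemon, pid).

    Repeat-compression lines are credited to the warning directly
    before them (an assumption about how syslog grouped the records).
    """
    counts = Counter()
    last_key = None  # (daemon, pid) of the previous record, if it was a queue warning
    for line in lines:
        line = line.strip()
        repeat = REPEAT_RE.search(line)
        if repeat:
            if last_key is not None:
                counts[last_key] += int(repeat.group(1))
            continue
        match = LINE_RE.match(line)
        if match and "send queue maximum length" in match.group(4):
            last_key = (match.group(1), match.group(2))
            counts[last_key] += 1
        else:
            last_key = None
    return counts
```

Feeding it the lines of the log file (for example via `open(path)`) returns a `Counter` keyed by daemon name and pid.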

For a little while, crm_mon reports both hosts as OFFLINE with no DC (even
though both are running heartbeat), but eventually it hangs. After some time,
the logs show errors talking to a CRM client, which I believe are related to
the warnings above.

[EMAIL PROTECTED] crm]# crm_mon
Defaulting to one-shot mode
You need to have curses available at compile time to enable console mode


============
Last updated: Mon Jul 21 13:54:38 2008
Current DC: NONE
2 Nodes configured.
0 Resources configured.
============

Node: ps1kpr (6e9462ba-7465-411c-bcb4-10baf68dffc3): OFFLINE
Node: ps0kpr (9cf680e5-a2db-4d3d-9c6f-1ca4da51eb9d): OFFLINE


Here's my ha.cf:

keepalive 2
deadtime 16
warntime 10
initdead 60
udpport 694
bcast   eth0            # Linux
auto_failback on
node    ps0kpr ps1kpr

debug 9


use_logd yes
crm yes
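
Since the same cluster works in v1.x mode, it can help to compare the two ha.cf variants directive by directive. The following is a minimal sketch, not part of heartbeat: `parse_ha_cf` and `diff_directives` are hypothetical helper names, and the parser assumes the simple "directive arguments # comment" layout used in the file above.

```python
def parse_ha_cf(text):
    """Parse ha.cf-style text into {directive: [argument lists]}.

    A directive may appear more than once, so each name maps to a
    list of argument lists in file order.
    """
    directives = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        name, *args = line.split()
        directives.setdefault(name, []).append(args)
    return directives


def diff_directives(old, new):
    """Return {directive: (old_value, new_value)} for directives that differ."""
    names = sorted(set(old) | set(new))
    return {
        name: (old.get(name), new.get(name))
        for name in names
        if old.get(name) != new.get(name)
    }
```

Calling `diff_directives(parse_ha_cf(v1_text), parse_ha_cf(v2_text))` lists only the directives that changed between the two files, with `None` standing for a directive that is absent on one side.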

And the logs:

Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: Enabling logging daemon
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: Version 2 support: yes
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: WARN: File /etc/ha.d/haresources exists.
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: WARN: This file is not used because crm is enabled
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive:  hacluster /usr/lib/heartbeat/ccm
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive:  hacluster /usr/lib/heartbeat/cib
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: root /usr/lib/heartbeat/lrmd -r
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: root /usr/lib/heartbeat/stonithd
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive:  hacluster /usr/lib/heartbeat/attrd
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive:  hacluster /usr/lib/heartbeat/crmd
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: respawn directive: root /usr/lib/heartbeat/mgmtd -v
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: AUTH: i=1: key = 0x8e6b750, auth=0x195228, authname=sha1
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: **************************
Jul 21 13:53:22 ps0kpr heartbeat: [12487]: info: Configuration validated. Starting heartbeat 2.1.3
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: heartbeat: version 2.1.3
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: Heartbeat generation: 19
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: G_main_add_TriggerHandler: Added signal manual handler
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: G_main_add_TriggerHandler: Added signal manual handler
Jul 21 13:53:22 ps0kpr heartbeat: [12488]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Jul 21 13:53:23 ps0kpr heartbeat: [12488]: info: Local status now set to: 'up'
Jul 21 13:53:23 ps0kpr heartbeat: [12488]: info: Managed write_hostcachedata process 12494 exited with return code 0.
Jul 21 13:53:23 ps0kpr heartbeat: [12488]: info: Link ps0kpr:eth0 up.
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Link ps1kpr:eth0 up.
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Status update for node ps1kpr: status up
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Comm_now_up(): updating status to active
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:33 ps0kpr last message repeated 16 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Local status now set to: 
'active'
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client 
"/usr/lib/heartbeat/ccm" (1001,104)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:33 ps0kpr last message repeated 52 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client 
"/usr/lib/heartbeat/cib" (1001,104)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:33 ps0kpr last message repeated 52 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client 
"/usr/lib/heartbeat/lrmd -r" (0,0)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:33 ps0kpr last message repeated 110 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client 
"/usr/lib/heartbeat/stonithd" (0,0)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:33 ps0kpr last message repeated 11 times
Jul 21 13:53:33 ps0kpr heartbeat: [12488]: info: Starting child client 
"/usr/lib/heartbeat/attrd" (1001,104)
Jul 21 13:53:33 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr last message repeated 59 times
Jul 21 13:53:34 ps0kpr heartbeat: [12488]: info: Starting child client 
"/usr/lib/heartbeat/crmd" (1001,104)
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr last message repeated 62 times
Jul 21 13:53:34 ps0kpr heartbeat: [12488]: info: Starting child client 
"/usr/lib/heartbeat/mgmtd -v" (0,0)
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr last message repeated 36 times
Jul 21 13:53:34 ps0kpr heartbeat: [12501]: info: Starting "/usr/lib/heartbeat/attrd" as uid 1001  gid 104 (pid 12501)
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr last message repeated 39 times
Jul 21 13:53:34 ps0kpr heartbeat: [12488]: info: Status update for node ps1kpr: 
status active
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr last message repeated 56 times
Jul 21 13:53:34 ps0kpr stonithd: [12500]: info: G_main_add_SignalHandler: Added 
signal handler for signal 10
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr stonithd: [12500]: info: G_main_add_SignalHandler: Added 
signal handler for signal 12
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr last message repeated 32 times
Jul 21 13:53:34 ps0kpr stonithd: [12500]: info: Signing in with heartbeat.
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr lrmd: [12499]: info: G_main_add_SignalHandler: Added 
signal handler for signal 15
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr mgmtd: [12503]: info: G_main_add_SignalHandler: Added 
signal handler for signal 15
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr last message repeated 2 times
Jul 21 13:53:34 ps0kpr attrd: [12501]: info: G_main_add_SignalHandler: Added 
signal handler for signal 15
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr crmd: [12502]: info: main: CRM Hg Version: node: 
552305612591183b1628baa5bc6e903e0f1e26a3
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: info: G_main_add_SignalHandler: Added 
signal handler for signal 15
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr last message repeated 4 times
Jul 21 13:53:34 ps0kpr lrmd: [12499]: info: G_main_add_SignalHandler: Added 
signal handler for signal 17
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:34 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr ccm: [12497]: info: Hostname: ps0kpr
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr cib: [12498]: info: G_main_add_TriggerHandler: Added 
signal manual handler
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr mgmtd: [12503]: info: G_main_add_SignalHandler: Added 
signal handler for signal 10
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr last message repeated 6 times
Jul 21 13:53:35 ps0kpr cib: [12498]: info: G_main_add_SignalHandler: Added 
signal handler for signal 17
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr mgmtd: [12503]: info: G_main_add_SignalHandler: Added 
signal handler for signal 12
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr last message repeated 3 times
Jul 21 13:53:35 ps0kpr lrmd: [12499]: info: G_main_add_SignalHandler: Added 
signal handler for signal 10
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr last message repeated 2 times
Jul 21 13:53:35 ps0kpr cib: [12498]: info: main: Retrieval of a per-action CIB: 
disabled
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr last message repeated 4 times
Jul 21 13:53:35 ps0kpr lrmd: [12499]: info: G_main_add_SignalHandler: Added 
signal handler for signal 12
Jul 21 13:53:35 ps0kpr cib: [12498]: WARN: send queue maximum length(500) 
exceeded
Jul 21 13:53:35 ps0kpr last message repeated 2 times
Jul 21 13:53:35 ps0kpr cib: [12498]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)

Any ideas?

Cheers
Simon



_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
