[Pacemaker] Unable to configure Pacemaker with cibadmin
Hello,

I am trying to update the configuration of my cluster through the cibadmin command, but the command always fails:

cibadmin --replace --scope resources --xml-file r.xml
Call cib_replace failed (-41): Remote node did not respond

I was able to replace the initial blank configuration, but updating it doesn't seem to work. The cluster is functioning and running some of the resources. Some of them are down, but I don't think that should make a difference:

Last updated: Fri Jul 22 18:33:03 2011
Stack: openais
Current DC: poc-tst-rh4 - partition with quorum
Version: 1.0.9-89bd754939df5150de7cd76835f98fe90851b677
2 Nodes configured, 2 expected votes
3 Resources configured.

Online: [ poc-tst-rh4 poc-tst-rh4-2 ]

Resource Group: mysql
    fs_mysql (ocf::heartbeat:Filesystem): Started poc-tst-rh4
    mysqld (ocf::heartbeat:mysql): Stopped
Master/Slave Set: ms_drbd_mysql
    Masters: [ poc-tst-rh4 ]
    Slaves: [ poc-tst-rh4-2 ]
Clone Set: pingclone
    Started: [ poc-tst-rh4-2 poc-tst-rh4 ]

Failed actions:
    mysqld_start_0 (node=poc-tst-rh4, call=26, rc=5, status=complete): not installed
    fs_mysql_start_0 (node=poc-tst-rh4-2, call=31, rc=5, status=complete): not installed

If I try to use the crm command line, it rejects any configuration changes I make:

crm configure edit
ERROR: could not replace mysql
INFO: offending xml:

What could be causing the configuration to fail?

Thank you for any assistance,
Kelly Wong

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
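For readers comparing status output across nodes, the "Failed actions" entries in the message above can be pulled apart mechanically. The following is a minimal Python sketch, assuming the line layout shown in that status output; the function name `parse_failed_actions` and the regular expression are illustrative and not part of Pacemaker.

```python
import re

# Pattern for one entry of the "Failed actions:" section, e.g.
# "mysqld_start_0 (node=poc-tst-rh4, call=26, rc=5, status=complete): not installed"
# The layout is assumed from the status output quoted in the message.
FAILED_ACTION = re.compile(
    r"(?P<op>\S+) \(node=(?P<node>[^,]+), call=(?P<call>\d+), "
    r"rc=(?P<rc>\d+), status=(?P<status>[^)]+)\): (?P<reason>.+)"
)


def parse_failed_actions(text):
    """Return one dict per failed-action line found in the status text."""
    actions = []
    for line in text.splitlines():
        match = FAILED_ACTION.search(line.strip())
        if match:
            entry = match.groupdict()
            entry["call"] = int(entry["call"])
            entry["rc"] = int(entry["rc"])
            actions.append(entry)
    return actions


sample = """Failed actions:
    mysqld_start_0 (node=poc-tst-rh4, call=26, rc=5, status=complete): not installed
    fs_mysql_start_0 (node=poc-tst-rh4-2, call=31, rc=5, status=complete): not installed
"""

for action in parse_failed_actions(sample):
    print(action["op"], action["node"], action["rc"], action["reason"])
```

Both entries in the posted output carry rc=5 with the reason "not installed", one on each node, which the parsed fields make easy to see side by side.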
Re: [Pacemaker] Sending message via cpg FAILED: (rc=12) Doesn't exist
22.07.2011 20:30, Steven Dake wrote:
> On 07/22/2011 01:15 AM, Proskurin Kirill wrote:
>> Hello all.
>>
>> pacemaker-1.1.5
>> corosync-1.4.0
>>
>> 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
>> Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
>>
>> Is there a problem?
>
> Does your retransmit list continually display e4 e5 etc for rest of
> cluster lifetime, or is this short lived?

Yes, it displays this continually.
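One way to check whether the retransmit list is continual or short lived is to count how often each identical list appears in the log. The following is a minimal Python sketch, assuming the "Retransmit List" line format quoted in this thread; the helper name `retransmit_counts` is hypothetical and not part of corosync.

```python
import re
from collections import Counter

# Matches corosync lines such as:
# "Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee"
# The format is assumed from the log excerpts quoted in this thread.
RETRANSMIT = re.compile(
    r"corosync \[TOTEM\s*\] Retransmit List: (?P<seqs>[0-9a-f ]+)"
)


def retransmit_counts(lines):
    """Count how many times each distinct retransmit list appears."""
    counts = Counter()
    for line in lines:
        match = RETRANSMIT.search(line)
        if match:
            counts[tuple(match.group("seqs").split())] += 1
    return counts


sample_lines = [
    "Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee",
    "Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee",
    "Jul 22 11:50:07 my107.example.com pacemakerd: [22388]: info: update_node_processes: ...",
    "Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee",
    "Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee",
]

for seqs, count in retransmit_counts(sample_lines).most_common():
    print(count, " ".join(seqs))
```

Run against a full log file (for example `open("/var/log/corosync.log")`, the path used elsewhere on this list), a single list whose count keeps growing over time corresponds to the "continual" case asked about above.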
Re: [Pacemaker] Sending message via cpg FAILED: (rc=12) Doesn't exist
On 07/22/2011 01:15 AM, Proskurin Kirill wrote:
> Hello all.
>
> pacemaker-1.1.5
> corosync-1.4.0
>
> 4 nodes in cluster. 3 online 1 not.
> In logs:
>
> Jul 22 11:50:23 my106.example.com crmd: [28030]: info: pcmk_quorum_notification: Membership 0: quorum retained (0)
> Jul 22 11:50:23 my106.example.com crmd: [28030]: info: do_started: Delaying start, no membership data (0010)
> Jul 22 11:50:23 my106.example.com crmd: [28030]: info: config_query_callback: Shutdown escalation occurs after: 120ms
> Jul 22 11:50:23 my106.example.com crmd: [28030]: info: config_query_callback: Checking for expired actions every 90ms
> Jul 22 11:50:23 my106.example.com crmd: [28030]: info: do_started: Delaying start, no membership data (0010)
> Jul 22 11:50:27 my106.example.com attrd: [28028]: info: cib_connect: Connected to the CIB after 1 signon attempts
> Jul 22 11:50:27 my106.example.com attrd: [28028]: info: cib_connect: Sending full refresh
> Jul 22 11:52:18 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
> Jul 22 11:52:18 corosync [CPG ] chosen downlist: sender r(0) ip(10.3.1.107) ; members(old:4 left:1)
> Jul 22 11:52:18 corosync [MAIN ] Completed service synchronization, ready to provide service.
> Jul 22 11:52:19 my106.example.com pacemakerd: [28021]: ERROR: send_cpg_message: Sending message via cpg FAILED: (rc=12) Doesn't exist
> Jul 22 11:52:19 my106.example.com pacemakerd: [28021]: ERROR: send_cpg_message: Sending message via cpg FAILED: (rc=12) Doesn't exist
> Jul 22 11:52:19 my106.example.com pacemakerd: [28021]: ERROR: send_cpg_message: Sending message via cpg FAILED: (rc=12) Doesn't exist
>
> DC:
>
> Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
> Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
> Jul 22 11:50:07 my107.example.com pacemakerd: [22388]: info: update_node_processes: Node my106.example.com now has process list: 0002 (was 0012)
> Jul 22 11:50:07 my107.example.com attrd: [22397]: info: crm_update_peer: Node my106.example.com: id=0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=0002 (new)
> Jul 22 11:50:07 my107.example.com cib: [22395]: info: crm_update_peer: Node my106.example.com: id=0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=0002 (new)
> Jul 22 11:50:07 my107.example.com stonith-ng: [22394]: info: crm_update_peer: Node my106.example.com: id=0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=0002 (new)
> Jul 22 11:50:07 my107.example.com crmd: [22399]: info: crm_update_peer: Node my106.example.com: id=0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=0002 (new)
> Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
> Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
>
> Is there a problem?

Does your retransmit list continually display e4 e5 etc for rest of cluster lifetime, or is this short lived?
[Pacemaker] Cluster type is: corosync
Hello again!

I hope I'm not flooding the list too much, but I have another problem.

I installed the same RPMs of corosync, openais, pacemaker and cluster_glue on all nodes; I checked this twice. But when I start some of the nodes, they can't connect to the cluster and stay offline. In the logs I can see that they see the other nodes and that connectivity is OK.

However, I found a difference. Online nodes in the cluster have:

[root@mysender39 ~]# grep 'Cluster type is' /var/log/corosync.log
Jul 22 20:38:58 mysender39.mail.ru stonith-ng: [3499]: info: get_cluster_type: Cluster type is: 'openais'.
Jul 22 20:38:58 mysender39.mail.ru attrd: [3502]: info: get_cluster_type: Cluster type is: 'openais'.
Jul 22 20:38:58 mysender39.mail.ru cib: [3500]: info: get_cluster_type: Cluster type is: 'openais'.
Jul 22 20:38:59 mysender39.mail.ru crmd: [3504]: info: get_cluster_type: Cluster type is: 'openais'.

Offline nodes have:

[root@mysender2 ~]# grep 'Cluster type is' /var/log/corosync.log
Jul 22 13:39:17 mysender2.mail.ru stonith-ng: [9028]: info: get_cluster_type: Cluster type is: 'corosync'.
Jul 22 13:39:17 mysender2.mail.ru attrd: [9031]: info: get_cluster_type: Cluster type is: 'corosync'.
Jul 22 13:39:17 mysender2.mail.ru cib: [9029]: info: get_cluster_type: Cluster type is: 'corosync'.
Jul 22 13:39:18 mysender2.mail.ru crmd: [9033]: info: get_cluster_type: Cluster type is: 'corosync'.

What's wrong, and how can I fix it?

--
Best regards,
Proskurin Kirill
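The per-node grep comparison in the message above can be automated once the logs from several nodes are collected in one place. The following is a minimal Python sketch, assuming the `get_cluster_type` line format shown in that message; the helper names `cluster_types_by_host` and `has_mismatch` are hypothetical and not part of Pacemaker.

```python
import re
from collections import defaultdict

# Matches lines such as:
# "Jul 22 20:38:58 mysender39.mail.ru cib: [3500]: info: get_cluster_type: Cluster type is: 'openais'."
# The format is assumed from the grep output quoted in the message.
CLUSTER_TYPE = re.compile(
    r"(?P<host>\S+) (?P<daemon>[\w-]+): \[\d+\]: info: get_cluster_type: "
    r"Cluster type is: '(?P<type>[^']+)'"
)


def cluster_types_by_host(lines):
    """Map each host name to the set of cluster types its daemons reported."""
    types = defaultdict(set)
    for line in lines:
        match = CLUSTER_TYPE.search(line)
        if match:
            types[match.group("host")].add(match.group("type"))
    return dict(types)


def has_mismatch(types_by_host):
    """Return True if more than one cluster type is reported overall."""
    seen = set()
    for reported in types_by_host.values():
        seen |= reported
    return len(seen) > 1


sample_lines = [
    "Jul 22 20:38:58 mysender39.mail.ru stonith-ng: [3499]: info: get_cluster_type: Cluster type is: 'openais'.",
    "Jul 22 20:38:59 mysender39.mail.ru crmd: [3504]: info: get_cluster_type: Cluster type is: 'openais'.",
    "Jul 22 13:39:17 mysender2.mail.ru cib: [9029]: info: get_cluster_type: Cluster type is: 'corosync'.",
    "Jul 22 13:39:18 mysender2.mail.ru crmd: [9033]: info: get_cluster_type: Cluster type is: 'corosync'.",
]

by_host = cluster_types_by_host(sample_lines)
for host, reported in sorted(by_host.items()):
    print(host, sorted(reported))
print("mismatch:", has_mismatch(by_host))
```

This only surfaces the difference between nodes; it does not say why one node detects 'corosync' and the other 'openais'.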
[Pacemaker] Sending message via cpg FAILED: (rc=12) Doesn't exist
Hello all.

pacemaker-1.1.5
corosync-1.4.0

There are 4 nodes in the cluster: 3 are online, 1 is not.
In the logs:

Jul 22 11:50:23 my106.example.com crmd: [28030]: info: pcmk_quorum_notification: Membership 0: quorum retained (0)
Jul 22 11:50:23 my106.example.com crmd: [28030]: info: do_started: Delaying start, no membership data (0010)
Jul 22 11:50:23 my106.example.com crmd: [28030]: info: config_query_callback: Shutdown escalation occurs after: 120ms
Jul 22 11:50:23 my106.example.com crmd: [28030]: info: config_query_callback: Checking for expired actions every 90ms
Jul 22 11:50:23 my106.example.com crmd: [28030]: info: do_started: Delaying start, no membership data (0010)
Jul 22 11:50:27 my106.example.com attrd: [28028]: info: cib_connect: Connected to the CIB after 1 signon attempts
Jul 22 11:50:27 my106.example.com attrd: [28028]: info: cib_connect: Sending full refresh
Jul 22 11:52:18 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jul 22 11:52:18 corosync [CPG ] chosen downlist: sender r(0) ip(10.3.1.107) ; members(old:4 left:1)
Jul 22 11:52:18 corosync [MAIN ] Completed service synchronization, ready to provide service.
Jul 22 11:52:19 my106.example.com pacemakerd: [28021]: ERROR: send_cpg_message: Sending message via cpg FAILED: (rc=12) Doesn't exist
Jul 22 11:52:19 my106.example.com pacemakerd: [28021]: ERROR: send_cpg_message: Sending message via cpg FAILED: (rc=12) Doesn't exist
Jul 22 11:52:19 my106.example.com pacemakerd: [28021]: ERROR: send_cpg_message: Sending message via cpg FAILED: (rc=12) Doesn't exist

DC:

Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
Jul 22 11:50:07 my107.example.com pacemakerd: [22388]: info: update_node_processes: Node my106.example.com now has process list: 0002 (was 0012)
Jul 22 11:50:07 my107.example.com attrd: [22397]: info: crm_update_peer: Node my106.example.com: id=0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=0002 (new)
Jul 22 11:50:07 my107.example.com cib: [22395]: info: crm_update_peer: Node my106.example.com: id=0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=0002 (new)
Jul 22 11:50:07 my107.example.com stonith-ng: [22394]: info: crm_update_peer: Node my106.example.com: id=0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=0002 (new)
Jul 22 11:50:07 my107.example.com crmd: [22399]: info: crm_update_peer: Node my106.example.com: id=0 state=unknown addr=(null) votes=0 born=0 seen=0 proc=0002 (new)
Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee
Jul 22 11:50:07 corosync [TOTEM ] Retransmit List: e4 e5 e7 e8 ea eb ed ee

Is there a problem?

--
Best regards,
Proskurin Kirill
[Pacemaker] Problem with colocation
Hello,

I'm having a problem with colocation (namely that services end up on different nodes):

Online: [ cluster1.intra cluster2.intra ]
OFFLINE: [ cluster3.intra ]

Sphinx_IP (ocf::heartbeat:IPaddr2): Started cluster1.intra
Sphinx (lsb:sphinx): Started cluster2.intra

As per request on irc, I've attached my cibadmin log.

--
Taneli Leppä | CISSP, RHCE, ZCE, CMDEV
Crasman Co Ltd | tan...@crasman.fi