Re: [Pacemaker] Unable to configure Pacemaker with cibadmin
On Tue, Aug 2, 2011 at 11:12 AM, Kelly Wong wrote:
> It waits for a pretty long time, at least 30 seconds. The weird thing is if
> I use cibadmin --replace --xml-file cib.xml, it’s able to update the cib, but
> if I try to replace a particular scope, it fails.

That's really weird.
Can you grab me the logs from the DC node from about the time you ran
the cibadmin command?

>
> Kelly Wong
> TXBU - Cisco Systems
>
>
> On 7/31/11 7:00 PM, "Andrew Beekhof" wrote:
>
> On Sat, Jul 23, 2011 at 11:57 AM, Kelly Wong wrote:
>> Hello,
>>
>> I am trying to update the configuration of my cluster through the cibadmin
>> command, but the command always fails:
>>
>> cibadmin --replace --scope resources --xml-file r.xml
>> Call cib_replace failed (-41): Remote node did not respond
>
> That error is triggered by a timeout. How long does the command wait
> before returning this error?
>
>>
>>
>> I was able to replace the initial blank configuration, but updating it
>> doesn’t seem to work. The cluster is functioning and running some of the
>> resources. Some of them are down, but I don’t think that should make a
>> difference:
>>
>> 
>> Last updated: Fri Jul 22 18:33:03 2011
>> Stack: openais
>> Current DC: poc-tst-rh4 - partition with quorum
>> Version: 1.0.9-89bd754939df5150de7cd76835f98fe90851b677
>> 2 Nodes configured, 2 expected votes
>> 3 Resources configured.
>> 
>>
>> Online: [ poc-tst-rh4 poc-tst-rh4-2 ]
>>
>> Resource Group: mysql
>>     fs_mysql (ocf::heartbeat:Filesystem): Started poc-tst-rh4
>>     mysqld (ocf::heartbeat:mysql): Stopped
>> Master/Slave Set: ms_drbd_mysql
>>     Masters: [ poc-tst-rh4 ]
>>     Slaves: [ poc-tst-rh4-2 ]
>> Clone Set: pingclone
>>     Started: [ poc-tst-rh4-2 poc-tst-rh4 ]
>>
>> Failed actions:
>>     mysqld_start_0 (node=poc-tst-rh4, call=26, rc=5, status=complete): not installed
>>     fs_mysql_start_0 (node=poc-tst-rh4-2, call=31, rc=5, status=complete): not installed
>>
>> If I try to use the crm command line, it rejects any configuration changes
>> I make:
>> crm configure edit
>> ERROR: could not replace mysql
>> INFO: offending xml:
>> > value="Started"/>
>> > type="Filesystem">
>> > value="/dev/drbd0"/>
>> > name="directory" value="/var/lib/mysql/"/>
>> > value="ext3"/>
>> > timeout="60"/>
>> > timeout="60"/>
>> > value="/usr/bin/mysqld_safe"/>
>> > value="/var/lib/mysql/mysql.pid"/>
>> > timeout="60"/>
>> > timeout="240"/>
>> > timeout="240"/>
>>
>> What could be causing the configuration to fail?
>>
>> Thank you for any assistance,
>> Kelly Wong

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
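Since the error in this thread is attributed to a timeout, it may help to measure exactly how long the failing command waits, and to note the start time so it can be matched against the DC node's logs. The following is only a minimal sketch: the helper name `time_command` is made up for illustration, and the `cibadmin` arguments are simply the ones quoted in the thread.

```python
import subprocess
import time

# The command exactly as quoted in the thread; r.xml is the poster's own file.
CIBADMIN_ARGV = ["cibadmin", "--replace", "--scope", "resources",
                 "--xml-file", "r.xml"]


def time_command(argv):
    """Run argv and return (exit_code, elapsed_seconds, combined_output).

    Hypothetical helper for illustration; it is not part of Pacemaker.
    """
    started_at = time.strftime("%Y-%m-%d %H:%M:%S")
    start = time.monotonic()
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    elapsed = time.monotonic() - start
    # Keep the wall-clock start time in the output so it can be
    # compared with log timestamps on the DC node.
    output = f"[started {started_at}]\n{proc.stdout}"
    return proc.returncode, elapsed, output
```

On a cluster node, `time_command(CIBADMIN_ARGV)` would return the exit code, the number of seconds the command waited, and its output, which is the information requested in the reply above.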
Re: [Pacemaker] Resources are not restarted on definition change after f59d7460bdde (devel)
On Wed, Aug 3, 2011 at 7:35 PM, Vladislav Bogdanov wrote:
> 01.08.2011 02:05, Andrew Beekhof wrote:
>> On Wed, Jul 27, 2011 at 11:46 AM, Andrew Beekhof wrote:
>>> On Fri, Jul 1, 2011 at 4:59 PM, Andrew Beekhof wrote:
>>>> Hmm. Interesting. I will investigate.
>>>
>>> This is an unfortunate side-effect of my history compression patch.
>>
>> Actually I'm mistaken on this. There should be enough information in
>> the CIB to handle definition changes properly.
>> Could you reproduce and include a hb_report please?
>
> Just returned from vacations.
>
> Does 885007a1795e address this issue?
>

Quite possibly.
Re: [Pacemaker] Live demo of Pacemaker Cloud on Fedora: Friday August 5th at 8am PST
Steven,

Are you planning on recording/taping it if I want to watch it later?

Thanks,
Bob

From: Steven Dake
To: pcmk-cl...@oss.clusterlabs.org
Cc: aeolus-de...@lists.fedorahosted.org; Fedora Cloud SIG; "open...@lists.linux-foundation.org"; The Pacemaker cluster resource manager
Sent: Wednesday, August 3, 2011 9:42 AM
Subject: [Pacemaker] Live demo of Pacemaker Cloud on Fedora: Friday August 5th at 8am PST

Extending a general invitation to the high availability communities and
other cloud community contributors to participate in a live demo I am
giving on Friday August 5th 8am PST (GMT-7). The demo portion of the
session is 15 minutes and will be presented first, followed by more
details of our approach to high availability.

I will use elluminate to show the demo on my desktop machine. To make
elluminate work, you will need icedtea-web installed on your system,
which is not typically installed by default.

You will also need a conference # and bridge code. Please contact me
offlist with your location and I'll provide you with a hopefully toll
free conference # and bridge code.

Elluminate link:
https://sas.elluminate.com/m.jnlp?sid=819&password=M.13AB020AEBE358D265FD925A07335F

Bridge Code: Please contact me off list with your location and I'll
respond back with dial-in information.
[Pacemaker] Talk about linux clusters
Hi,

I have the pleasure of giving a talk about Linux clusters tomorrow in Berlin.
The talk will be in German. For details please see:

http://www.guug.de/lokal/berlin/

Please feel free to attend if you have time.

Greetings,

--
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München

Tel: (0163) 172 50 98
[Pacemaker] Live demo of Pacemaker Cloud on Fedora: Friday August 5th at 8am PST
Extending a general invitation to the high availability communities and
other cloud community contributors to participate in a live demo I am
giving on Friday August 5th 8am PST (GMT-7). The demo portion of the
session is 15 minutes and will be presented first, followed by more
details of our approach to high availability.

I will use elluminate to show the demo on my desktop machine. To make
elluminate work, you will need icedtea-web installed on your system,
which is not typically installed by default.

You will also need a conference # and bridge code. Please contact me
offlist with your location and I'll provide you with a hopefully toll
free conference # and bridge code.

Elluminate link:
https://sas.elluminate.com/m.jnlp?sid=819&password=M.13AB020AEBE358D265FD925A07335F

Bridge Code: Please contact me off list with your location and I'll
respond back with dial-in information.
Re: [Pacemaker] Reload action and stop/start sequence questions
27.07.2011 05:25, Andrew Beekhof wrote:
...
>> * Dependent resources should not be stopped/started for 'reload' action.
>> Of course they are restarted if reload fails and stop/start is executed
>> then. (I see that they are restarted now for reload of a resource they
>> depend on, is it a bug?)
>
> More like a limitation. Which is a roundabout way of saying "really
> hard to fix bug".
> You're welcome to create a BZ for it though, maybe one day I'll figure
> out how to resolve it.
>
>> * (wish) Resources should be migrated out of node (if they support live
>> migration) for stop/start sequence of resource they depend on.
>
> Migration can only occur if a resource is at the bottom (excluding any
> clones) of the resource stack.
> In order to migrate, any colocation dependencies need to be running at
> _both_ the old and the new locations.
>
> This can only be true for resources that depend on clones.

Yep, I actually had clones in mind.

>
>> * (wish) Redefinition of clones should be handled in a way which allows
>> dependent live-migratable resources to survive (if reload action for
>> clone instance either is not supported or fails).
>
> This doesn't make sense.
> If the definition of one clone changes, then they all change and there
> is nowhere for dependent resources to migrate to.

Yes, I understand your point. That's why I marked this as a wish.
It would be a killer feature: serialized restarts of clone instances.

>
>> That is: dependent
>> resources which support live migration are first tried to migrate out of
>> one node, and are stopped if migration fails. Then clone instance is
>> restarted on that node. Then the same procedure applies to next cluster
>> node so resources may return back to a first node.
>>
>> If above (at least first three points) is right, then is it possible to
>> get a set of previous instance parameters the same way new configuration
>> is passed (env vars), or RA should save that information itself in advance?
>>
>> Best,
>> Vladislav
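The last quoted question, whether a resource agent must save its previous parameters itself, is not answered in this message. As a rough sketch of that second option only, the snippet below has an agent record the instance parameters it was started with and compare them with the new ones on reload. It relies on instance parameters arriving as `OCF_RESKEY_<name>` environment variables; the function names, the JSON state file and skipping `OCF_RESKEY_CRM_meta_*` variables are assumptions made for illustration. Real agents are usually shell scripts, so Python is used here purely for readability.

```python
import json
import os

PREFIX = "OCF_RESKEY_"
META_PREFIX = "CRM_meta_"  # assumption: meta attributes are not instance parameters


def current_params(environ=None):
    """Collect instance parameters from OCF_RESKEY_* environment variables."""
    environ = os.environ if environ is None else environ
    params = {}
    for key, value in environ.items():
        if key.startswith(PREFIX):
            name = key[len(PREFIX):]
            if not name.startswith(META_PREFIX):
                params[name] = value
    return params


def save_params(path, params):
    """Write the parameters to a state file (hypothetical location)."""
    with open(path, "w") as fh:
        json.dump(params, fh, sort_keys=True)


def load_params(path):
    """Read previously saved parameters; empty dict if none were saved."""
    try:
        with open(path) as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def changed_params(old, new):
    """Map each changed parameter name to an (old_value, new_value) pair."""
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}
```

In this sketch the agent would call `save_params` at the end of a successful start, and on reload compare `load_params(path)` with `current_params()` to see which values changed.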
Re: [Pacemaker] Resources are not restarted on definition change after f59d7460bdde (devel)
01.08.2011 02:05, Andrew Beekhof wrote:
> On Wed, Jul 27, 2011 at 11:46 AM, Andrew Beekhof wrote:
>> On Fri, Jul 1, 2011 at 4:59 PM, Andrew Beekhof wrote:
>>> Hmm. Interesting. I will investigate.
>>
>> This is an unfortunate side-effect of my history compression patch.
>
> Actually I'm mistaken on this. There should be enough information in
> the CIB to handle definition changes properly.
> Could you reproduce and include a hb_report please?

Just returned from vacations.

Does 885007a1795e address this issue?
Re: [Pacemaker] Backup ring is marked faulty
Hello,

we have exactly the same issue! Same version of corosync (1.3.1), also
running on SuSE Linux Enterprise Server 11 SP1 with HAE.

Aug 01 15:45:18 corosync [TOTEM ] Received ringid(172.20.16.2:308) seq 6a
Aug 01 15:45:18 corosync [TOTEM ] Received ringid(172.20.16.2:308) seq 63
Aug 01 15:45:18 corosync [TOTEM ] releasing messages up to and including 60
Aug 01 15:45:18 corosync [TOTEM ] releasing messages up to and including 6d
Aug 01 15:45:18 corosync [TOTEM ] Marking seqid 162 ringid 1 interface 10.2.2.6 FAULTY - administrative intervention required.

rksaph06:/var/log/cluster # corosync-cfgtool -s
Printing ring status.
Local node ID 101717164
RING ID 0
        id      = 172.20.16.6
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.2.2.6
        status  = Marking seqid 162 ringid 1 interface 10.2.2.6 FAULTY - administrative intervention required.

rrp_mode is set to "passive".
Ring 0 (172.20.16.0) supports 1 GBit and ring 1 (10.2.2.0) supports 100 MBit.
There was no other network traffic on ring 1, only corosync (!)

After re-activating both rings with "corosync-cfgtool -r", the problem is
reproducible by simply connecting a crm_gui and hitting "refresh" inside
the GUI 3-5 times. After that, ring 1 (10.2.2.0) is marked as "faulty" again.

Thanks and best regards,
-Martin Tegtmeier


-----Original Message-----
From: Sebastian Kaps [mailto:sebastian.k...@imail.de]
Sent: Wed 03.08.2011 08:53
To: The Pacemaker cluster resource manager
Subject: Re: [Pacemaker] Backup ring is marked faulty

Hi Steven!

On Tue, 02 Aug 2011 17:45:46 -0700, Steven Dake wrote:
> Which version of corosync?

# corosync -v
Corosync Cluster Engine, version '1.3.1'
Copyright (c) 2006-2009 Red Hat, Inc.

It's the version that comes with SLES11-SP1-HA.

--
Sebastian
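For anyone who wants to watch for this condition automatically, the ring status printed by `corosync-cfgtool -s` in the message above can be scanned for rings marked FAULTY. The snippet below is only a sketch: the sample text is copied from that message, its exact whitespace is an assumption (the archive does not preserve it), and the function names are made up for illustration.

```python
import re

# Sample taken from the message above; indentation is assumed.
SAMPLE = """Printing ring status.
Local node ID 101717164
RING ID 0
        id      = 172.20.16.6
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.2.2.6
        status  = Marking seqid 162 ringid 1 interface 10.2.2.6 FAULTY - administrative intervention required.
"""


def parse_ring_status(text):
    """Return one dict per ring with keys 'ring', 'id' and 'status'."""
    rings = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"RING ID (\d+)$", line)
        if header:
            current = {"ring": int(header.group(1)), "id": None, "status": None}
            rings.append(current)
            continue
        field = re.match(r"(id|status)\s*=\s*(.*)$", line)
        if field and current is not None:
            current[field.group(1)] = field.group(2)
    return rings


def faulty_rings(rings):
    """Return the ring numbers whose status text contains FAULTY."""
    return [r["ring"] for r in rings if r["status"] and "FAULTY" in r["status"]]
```

With the sample above, `faulty_rings(parse_ring_status(SAMPLE))` reports ring 1, matching the status shown in the message. In practice the text would come from running `corosync-cfgtool -s` on the node.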
Re: [Pacemaker] Backup ring is marked faulty
Hi Steven!

On Tue, 02 Aug 2011 17:45:46 -0700, Steven Dake wrote:
> Which version of corosync?

# corosync -v
Corosync Cluster Engine, version '1.3.1'
Copyright (c) 2006-2009 Red Hat, Inc.

It's the version that comes with SLES11-SP1-HA.

--
Sebastian