Re: [Pacemaker] Same host displayed twice in crm status
I've already tried to remove the node following this document:
http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-node-delete.html
using the following commands:

    crm_node -R <COROSYNC_ID>
    cibadmin --delete --obj_type nodes --crm_xml '<node uname="VMTESTORADG2.it.dbi-services.com"/>'
    cibadmin --delete --obj_type status --crm_xml '<node_state uname="VMTESTORADG2.it.dbi-services.com"/>'

I also tried deleting the status before deleting the node, with the same result. The issue remains: the node is deleted, but when corosync is restarted on any node, the deleted name appears again. The server whose hostname was changed has been rebooted, so I don't know where the reference to the old name could be stored except in the cluster itself.

Regarding the versions, here are the details:
- Corosync 1.2.7-1.1.el5
- Pacemaker 1.1.5-1.1.el5

2013/4/1 David Vossel dvos...@redhat.com

> ----- Original Message -----
> From: Nicolas J. nikkro70+pacema...@gmail.com
> To: pacemaker@oss.clusterlabs.org
> Sent: Friday, March 29, 2013 8:55:30 AM
> Subject: [Pacemaker] Same host displayed twice in crm status
>
> > Hi,
> >
> > I have a problem with a Corosync/Pacemaker configuration. One host of
> > the cluster has been renamed, and now the host is displayed twice in the
> > configuration. When I try to remove the host from the configuration it
> > works, but if corosync is restarted on one node the old host appears
> > again. I tried several ways to delete the host, with no effect. How can
> > I delete the wrong host?
>
> For the pacemaker version you are using, try deleting the node from the
> configuration in both the node and status sections, then use the crm_node
> -R option to remove the node from the cluster's internal cache. In
> pacemaker versions >= 1.1.8, only the crm_node -R option is required to
> remove a node.
>
> -- Vossel
>
> > I checked the Linux configuration and there is no place where the old
> > name is referenced. It's an OEL/Red Hat Linux.
> >
> > Output:
> >
> >     [root@vmtestoradg2 ~]# crm status
> >     Last updated: Fri Mar 29 14:51:56 2013
> >     Stack: openais
> >     Current DC: vmtestoradg1 - partition with quorum
> >     Version: 1.1.5-1.1.el5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> >     4 Nodes configured, 3 expected votes
> >     1 Resources configured.
> >
> >     Online: [ vmtestoradg1 vmtestora10g01 vmtestoradg2 ]
> >     OFFLINE: [ VMTESTORADG2.it.dbi-services.com ]
> >
> >     DG_IP (ocf::heartbeat:IPaddr2): Started vmtestoradg1
> >
> >     [root@vmtestoradg2 ~]# crm node clearstate VMTESTORADG2.it.dbi-services.com
> >     Do you really want to drop state for node VMTESTORADG2.it.dbi-services.com ? y
> >     [root@vmtestoradg2 ~]# crm node delete VMTESTORADG2.it.dbi-services.com
> >     INFO: node VMTESTORADG2.it.dbi-services.com not found by crm_node
> >     INFO: node VMTESTORADG2.it.dbi-services.com deleted
> >
> > Thanks in advance
> >
> > Best Regards,
> > Nicolas J.
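[Editor's note: since crm_node -R above takes the corosync node ID rather
than the node name, a minimal sketch of looking the ID up first (crm_node -l
lists cluster membership in this era; the exact output format varies by
version):]

    # list node IDs and names known to the membership layer
    crm_node -l
    # then remove the stale entry by its numeric ID
    crm_node -R <COROSYNC_ID>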
[Pacemaker] [PATCH] Use correct OCF_ROOT_DIR in include/crm/services.h.
Previously, libcrmservice always had OCF_ROOT_DIR defined as /usr/lib/ocf,
despite the fact that another path was defined in glue_config.h. Caught on
SunOS 5.11 while configuring cluster-glue and pacemaker with a non-standard
prefix.
---
 lib/services/Makefile.am | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/services/Makefile.am b/lib/services/Makefile.am
index 3ee3347..8d44dad 100644
--- a/lib/services/Makefile.am
+++ b/lib/services/Makefile.am
@@ -25,7 +25,7 @@ noinst_HEADERS = upstart.h systemd.h services_private.h
 libcrmservice_la_SOURCES = services.c services_linux.c
 
 libcrmservice_la_LDFLAGS = -version-info 1:0:0
-libcrmservice_la_CFLAGS = $(GIO_CFLAGS)
+libcrmservice_la_CFLAGS = -DOCF_ROOT_DIR=\"@OCF_ROOT_DIR@\" $(GIO_CFLAGS)
 
 libcrmservice_la_LIBADD = $(GIO_LIBS)

 if BUILD_UPSTART

-- 
Andrei
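[Editor's note: a quick spot-check, not part of the patch, to confirm which
OCF root the freshly built library actually carries. The .libs path is
libtool's default build location and is an assumption here:]

    # after rebuilding, the configured path should appear instead of /usr/lib/ocf
    strings lib/services/.libs/libcrmservice.so* | grep -i ocf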
Re: [Pacemaker] [Problem][crmsh]The designation of the 'ordered' attribute becomes the error.
Hi,

On Mon, Apr 01, 2013 at 09:19:51PM +0200, Andreas Kurz wrote:
> Hi Dejan,
>
> On 2013-03-06 11:59, Dejan Muhamedagic wrote:
> > Hi Hideo-san,
> >
> > On Wed, Mar 06, 2013 at 10:37:44AM +0900, renayama19661...@ybb.ne.jp wrote:
> > > Hi Dejan, Hi Andrew,
> > >
> > > As for the crm shell, the check of the meta attributes was revised with
> > > the following patch:
> > >  * http://hg.savannah.gnu.org/hgweb/crmsh/rev/d1174f42f4b3
> > >
> > > This patch was backported to Pacemaker 1.0.13:
> > >  * https://github.com/ClusterLabs/pacemaker-1.0/commit/fa1a99ab36e0ed015f1bcbbb28f7db962a9d1abc#shell/modules/cibconfig.py
> > >
> > > However, the ordered/colocated attributes of a group resource are
> > > treated as an error when I use a crm shell that includes this patch:
> > >
> > > (snip)
> > > ### Group Configuration ###
> > > group master-group \
> > >     vip-master \
> > >     vip-rep \
> > >     meta \
> > >         ordered="false"
> > > (snip)
> > >
> > > [root@rh63-heartbeat1 ~]# crm configure load update test2339.crm
> > > INFO: building help index
> > > crm_verify[20028]: 2013/03/06_17:57:18 WARN: unpack_nodes: Blind faith: not fencing unseen nodes
> > > WARNING: vip-master: specified timeout 60s for start is smaller than the advised 90
> > > WARNING: vip-master: specified timeout 60s for stop is smaller than the advised 100
> > > WARNING: vip-rep: specified timeout 60s for start is smaller than the advised 90
> > > WARNING: vip-rep: specified timeout 60s for stop is smaller than the advised 100
> > > ERROR: master-group: attribute ordered does not exist   <- WHY?
> > > Do you still want to commit? y
> > >
> > > If I answer `yes` at the confirmation prompt, the change is committed,
> > > but it is a problem that the error message is displayed at all.
> > >  * The same error occurs when I specify the colocated attribute.
> > >
> > > I also noticed that the ordered/colocated attributes of group resources
> > > are not explained in the Pacemaker online help.
> > >
> > > I think that specifying the ordered/colocated attributes of a group
> > > resource should not be treated as an error, and that ordered/colocated
> > > should be added to the online help.
> >
> > These attributes are not listed in crmsh. Does the attached patch help?
>
> Dejan, will this patch for the missing "ordered" and "collocated" group
> meta-attributes be included in the next crmsh release? ... I can't see the
> patch in the current tip.

The shell in pacemaker v1.0.x is in maintenance mode and is shipped along
with the pacemaker code. The v1.1.x shell doesn't have the ordered and
collocated meta attributes.

Thanks,

Dejan

> Thanks
>
> Regards,
> Andreas

> > Thanks,
> >
> > Dejan

> > > Best Regards,
> > > Hideo Yamauchi.
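[Editor's note: for readers who hit the same ERROR before the crmsh check is
relaxed, one possible workaround, sketched here and not taken from the
thread, is to set the group's meta attribute directly with crm_resource,
which bypasses crmsh's metadata check. Whether the attribute is then
honoured depends on your pacemaker version:]

    # -m/--meta targets the resource's meta attributes rather than
    # instance parameters
    crm_resource --resource master-group --meta \
        --set-parameter ordered --parameter-value false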
[Pacemaker] resource failover in active/active cluster
Hello guys,

I am running corosync 1.4.2 and pacemaker 1.1.7 on Debian, trying to deploy
an active/active cluster with two nodes. The main problem is that I have two
resources that depend on each other: the VIP is cloned across the two nodes,
and the nginx daemon depends on the VIP. But when nginx goes down, the VIP
on the failed node is not removed from the cluster, and that node still
answers requests.

Here is my current configuration:

    node host01
    node host02
    primitive ClusterIP ocf:heartbeat:IPaddr2 \
        params ip="192.168.2.100" cidr_netmask="24" nic="eth0" clusterip_hash="sourceip" \
        op monitor interval="1s"
    primitive WebSite lsb:nginx \
        op monitor interval="1s"
    clone WebIP ClusterIP \
        meta globally-unique="true" clone-max="2" clone-node-max="2"
    clone WebSiteClone WebSite
    colocation website-with-ip inf: WebSiteClone WebIP
    order nginx-after-ip inf: WebIP WebSiteClone
    property $id="cib-bootstrap-options" \
        dc-version="1.1.7-ee0730e13d124c3d58f00016c3376a1de5323cff" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        last-lrm-refresh="1364509824"

I have followed the Pacemaker doc
(http://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/pdf/Clusters_from_Scratch/Pacemaker-1.1-Clusters_from_Scratch-en-US.pdf),
but I haven't used the distributed file system because, in this case, nginx
is not sharing configuration files and is acting as a load balancer.

Can you tell me how to link the resources?

Thank you
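[Editor's note: a minimal way to reproduce the reported symptom, sketched
here with the node names from the configuration above:]

    # on host01: kill nginx and take a one-shot look at cluster status
    killall nginx
    crm_mon -1
    # reported behaviour: WebSite fails on host01, but the WebIP clone
    # instance stays started there, so host01 keeps answering requests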
Re: [Pacemaker] Speeding up startup after migration
----- Original Message -----
From: Lars Marowsky-Bree l...@suse.com
To: pacemaker@oss.clusterlabs.org
Sent: Monday, April 1, 2013 5:21:53 PM
Subject: Re: [Pacemaker] Speeding up startup after migration

> On 2013-04-01T13:09:14, David Vossel dvos...@redhat.com wrote:
>
> > > So, if I understand correctly, the new lrmd runs as many simultaneous
> > > jobs as possible. Unfortunately, in some circumstances this results in
> > > high node load and timeouts. Is there a way to somehow limit that load?
> >
> > Isn't that what the batch-limit option does? Or are you saying you want
> > a batch-limit type option that is node-specific? Why are you concerned
> > about this behavior living in the LRMD instead of at the transition
> > processing level? I believe that if we do any batch-limiting type
> > behavior at the LRMD level, we're going to run into problems with the
> > transition timers in the crmd. The LRMD needs to always perform the
> > actions it is given as soon as possible.
>
> Seriously, folks, the LRM rewrite may turn out not to be the best example
> of pacemaker's attention to detail ;-)

such is any re-write of poorly designed code ;-)

--- I included the smiley so my jab is acceptable and not in poor taste,
just like yours! :D
--- I included this smiley because I think it looks funny.

> Yes, the previous LRM had a per-node concurrency limit. This avoided
> overloading the nodes via IO, which is why it was added. (And it also
> smoothed out spikes in the monitoring calls, should they happen to
> coincide.) The default limit of parallel executions was 4, or half the
> number of CPU cores, if memory serves.
>
> This turned out to actually improve performance (since it avoided said
> spikes) and avoid timeouts. (While it is true that, given a perfect
> scheduler, the total runtime of N_1..100 being kicked off all at once
> should be equal to N_1..100 being kicked off serially, it's quite likely
> that doing the former will mean at least a few of those 100 operations
> hitting their *individual* timeouts at the LRM level.)

I'm convinced this is useful. I'll add PCMK_MAX_CHILDREN to the sysconfig
documentation. To be backwards compatible, I'll have the lrmd internally
interpret your LRMD_MAX_CHILDREN environment variable as well. Sound
reasonable?

> The TE doesn't have enough knowledge to enforce this, since it doesn't
> know if monitors get scheduled. The transition timers weren't really a
> problem, since they had some lee-way accounted for.
>
> If we don't have this functionality right now anymore, I do believe we
> need it back.
>
> I do seem to recall that at the time, Andrew preferred it to be
> implemented at the LRM level, because it avoided more complex transition
> graph logic (e.g., the batch-limit functionality on a per-node level, and
> doing something smart about monitors); but my memory is hazy on this
> detail.
>
> Nowadays, since we have the migration-threshold anyway, it may be possible
> to do something about it cleanly in the TE, but that still would leave the
> monitors unsolved ...
>
> Regards,
>     Lars
>
> (PS: 1.1.8 really isn't turning out to be my favorite release. If I wasn't
> afraid it'd be received as a rant, I'd try to write up a post-mortem from
> my/our perspective to see what might be avoidable in the future.)

We should open this discussion at some point. As long as it is constructive
criticism, I doubt it will be perceived as a rant. I've mentioned to Andrew
that we might need to consider doing release candidates. This would at least
put some of the responsibility back on the community to verify the release
with us before we officially tag it.
We definitely test our code, but it is impossible for us to test everyone's
possible deployment use-case.

-- Vossel

> -- 
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix
> Imendörffer, HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
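[Editor's note: a sketch of how the knob discussed above would be set. The
variable names are taken from this thread; the file location and exact
semantics are assumptions and may differ by distribution and by what finally
ships:]

    # /etc/sysconfig/pacemaker
    # cap the number of resource operations the lrmd runs in parallel
    PCMK_MAX_CHILDREN=4        # name Vossel proposes above
    LRMD_MAX_CHILDREN=4        # legacy name, to be honoured for compatibility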
Re: [Pacemaker] Same host displayed twice in crm status
----- Original Message -----
From: Nicolas J. nikkro70+pacema...@gmail.com
To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org
Sent: Tuesday, April 2, 2013 2:07:14 AM
Subject: Re: [Pacemaker] Same host displayed twice in crm status

> I've already tried to remove the node following this document:
> http://clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-node-delete.html
> using the following commands:
>
>     crm_node -R <COROSYNC_ID>
>     cibadmin --delete --obj_type nodes --crm_xml '<node uname="VMTESTORADG2.it.dbi-services.com"/>'
>     cibadmin --delete --obj_type status --crm_xml '<node_state uname="VMTESTORADG2.it.dbi-services.com"/>'
>
> I also tried deleting the status before deleting the node, with the same
> result. The issue remains: the node is deleted, but when corosync is
> restarted on any node, the deleted name appears again. The server whose
> hostname was changed has been rebooted, so I don't know where the
> reference to the old name could be stored except in the cluster itself.

Unfortunately, this looks like a bug. It sounds like the crm_node -R option
isn't properly discarding the node cache, which means that every time a new
policy engine transition is generated, that node slips back in.

I'm not sure what to tell you. I'm guessing the only way to fix this is to
shut down the cluster entirely, making sure there is no mention of the old
node in the config on startup, or to try a newer version of pacemaker. I'm
sure neither of those solutions is what you want to hear, though. Maybe
someone else who has encountered this with your version has some better
advice.

-- Vossel

> Regarding the versions, here are the details:
> - Corosync 1.2.7-1.1.el5
> - Pacemaker 1.1.5-1.1.el5
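[Editor's note: to make the "shut down the cluster entirely" suggestion
concrete, a hedged sketch of that workaround. The CIB path is typical for
pacemaker 1.1.5 on the openais stack, and the signature-file step is an
assumption; adapt to your installation:]

    # on every node (pacemaker runs inside corosync/openais on this stack)
    service corosync stop

    # with the whole cluster down, remove the stale <node .../> and
    # <node_state .../> entries from the on-disk CIB, then drop the digest
    # file so the edited CIB is accepted on startup
    vi /var/lib/heartbeat/crm/cib.xml
    rm -f /var/lib/heartbeat/crm/cib.xml.sig

    # restart the stack on every node
    service corosync start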
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org