Re: [Pacemaker] node status does not change even if pacemakerd dies
(12.12.05 02:02), David Vossel wrote: - Original Message - From: Kazunori INOUE inouek...@intellilink.co.jp To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Sent: Monday, December 3, 2012 11:41:56 PM Subject: Re: [Pacemaker] node status does not change even if pacemakerd dies (12.12.03 20:24), Andrew Beekhof wrote: On Mon, Dec 3, 2012 at 8:15 PM, Kazunori INOUE inouek...@intellilink.co.jp wrote: (12.11.30 23:52), David Vossel wrote: - Original Message - From: Kazunori INOUE inouek...@intellilink.co.jp To: pacemaker@oss pacemaker@oss.clusterlabs.org Sent: Friday, November 30, 2012 2:38:50 AM Subject: [Pacemaker] node status does not change even if pacemakerd dies Hi, I am testing the latest version. - ClusterLabs/pacemaker 9c13d14640(Nov 27, 2012) - corosync 92e0f9c7bb(Nov 07, 2012) - libqb 30a7871646(Nov 29, 2012) Although I killed pacemakerd, node status did not change. [dev1 ~]$ pkill -9 pacemakerd [dev1 ~]$ crm_mon : Stack: corosync Current DC: dev2 (2472913088) - partition with quorum Version: 1.1.8-9c13d14 2 Nodes configured, unknown expected votes 0 Resources configured. Online: [ dev1 dev2 ] [dev1 ~]$ ps -ef|egrep 'corosync|pacemaker' root 11990 1 1 16:05 ?00:00:00 corosync 496 12010 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/cib root 12011 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/stonithd root 12012 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/lrmd 496 12013 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/attrd 496 12014 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/pengine 496 12015 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/crmd We want the node status to change to OFFLINE(stonith-enabled=false), UNCLEAN(stonith-enabled=true). That is, we want the function of this deleted code. https://github.com/ClusterLabs/pacemaker/commit/dfdfb6c9087e644cb898143e198b240eb9a928b4 How are you launching pacemakerd? The systemd service script relaunches pacemakerd on failure and pacemakerd has the ability to attach to all the old processes if they are still around as if nothing happened. -- Vossel Hi David, We are using RHEL6 and use it for a while after this. Therefore, I start it by the following commands. $ /etc/init.d/pacemakerd start or $ service pacemaker start Ok. Are you using the pacemaker plugin? When using cman or corosync 2.0, pacemakerd isn't strictly needed for normal operation. Its only there to shutdown and/or respawn failed components. We are using corosync 2.1, so service does not stop normally after pacemakerd died. $ pkill -9 pacemakerd $ service pacemaker stop $ echo $? 0 $ ps -ef|egrep 'corosync|pacemaker' root 3807 1 0 13:10 ?00:00:00 corosync 496 3827 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/cib root 3828 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/stonithd root 3829 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/lrmd 496 3830 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/attrd 496 3831 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/pengine 496 3832 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/crmd Ah yes, that is a problem. Having pacemaker still running when the init script says it is down... that is bad. Perhaps we should just make the init script smart enough to check to make sure all the pacemaker components are down after pacemakerd is down. The argument of whether or not the failure of pacemakerd is something that the cluster should be alerted to is something i'm not sure about. With the corosync 2.0 stack, pacemakerd really doesn't do anything except launch processes/relaunch processes. 
A cluster can be completely functional without a pacemakerd instance running anywhere. If any of the actual pacemaker components on a node fail, the logic that causes that node to get fenced has nothing to do with pacemakerd. -- Vossel Hi, I think that relaunch processes of pacemakerd is a very useful function, so I want to avoid management of a resource in the node in which pacemakerd does not exist. Though the best solution is to relaunch pacemakerd, if it is difficult, I think that a shortcut method is to make a node unclean. And now, I tried Upstart a little bit. 1) started the corosync and pacemaker. $ cat /etc/init/pacemaker.conf respawn script [ -f /etc/sysconfig/pacemaker ] { . /etc/sysconfig/pacemaker } exec /usr/sbin/pacemakerd end script $ service co start Starting Corosync Cluster Engine (corosync): [ OK ] $ initctl start pacemaker pacemaker start/running, process 4702 $ ps -ef|egrep 'corosync|pacemaker' root 4695 1 0
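Two sketches for readers following this exchange. First, the post-stop check David suggests for the RHEL6 init script could be as simple as the loop below; the binary paths come from the ps listings quoted above, everything else is an assumption rather than what the shipped script does:

    # Hypothetical post-stop check: fail if any pacemaker child survived.
    for daemon in cib stonithd lrmd attrd pengine crmd; do
        if pgrep -f "/usr/libexec/pacemaker/$daemon" >/dev/null 2>&1; then
            echo "pacemaker component '$daemon' still running after stop" >&2
            exit 1
        fi
    done

Second, the /etc/init/pacemaker.conf quoted above appears to have lost an "&&" to the archive; it presumably read:

    # /etc/init/pacemaker.conf (reconstruction; the "&&" is assumed)
    respawn
    script
        [ -f /etc/sysconfig/pacemaker ] && . /etc/sysconfig/pacemaker
        exec /usr/sbin/pacemakerd
    end script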
Re: [Pacemaker] Getting Started
Ok, almost there :) I'm having some trouble with VIPs either not starting or starting on the wrong node (so something isn't right :)). Lab04 should be the master (vipMaster), lab05 slave (vipSlave) (Postgres is up and running as a replication slave on lab05, although it's being reported as stopped...) Output from crm_mon -Af Last updated: Wed Dec 5 09:35:58 2012 Last change: Wed Dec 5 09:35:57 2012 via crm_attribute on lab04 Stack: openais Current DC: lab04 - partition with quorum Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14 2 Nodes configured, 2 expected votes 6 Resources configured. Online: [ lab05 lab04 ] Master/Slave Set: msPostgreSQL [pgsql] Masters: [ lab04 ] Stopped: [ pgsql:1 ] vipSlave(ocf::heartbeat:IPaddr2): Started lab04 Clone Set: clnPingCheck [pingCheck] Started: [ lab04 ] Stopped: [ pingCheck:1 ] vipMaster (ocf::heartbeat:IPaddr2): Started lab04 Node Attributes: * Node lab05: + master-pgsql:0: -INFINITY + master-pgsql:1: 100 + pgsql-data-status : STREAMING|SYNC + pgsql-status : STOP * Node lab04: + master-pgsql:0: 1000 + pgsql-data-status : LATEST + pgsql-master-baseline : 0A000200 + pgsql-status : PRI + pingNodes : 200 Migration summary: * Node lab04: * Node lab05: How do I migrate vipSalve to node lab05? I've tried # crm resource migrate vipSlave lab05 I did find this in the corosync log Dec 05 09:35:58 [2064] lab04pengine: notice: unpack_rsc_op: Operation monitor found resource vipMaster active on lab04 Dec 05 09:35:58 [2064] lab04pengine: notice: unpack_rsc_op: Operation monitor found resource pgsql:0 active in master mode on lab04 Dec 05 09:35:58 [2064] lab04pengine: notice: unpack_rsc_op: Operation monitor found resource vipSlave active on lab04 Dec 05 09:35:58 [2064] lab04pengine: notice: unpack_rsc_op: Operation monitor found resource pingCheck:0 active on lab04 Dec 05 09:35:58 [2064] lab04pengine: notice: unpack_rsc_op: Operation monitor found resource pgsql:1 active on lab05 Dec 05 09:35:58 [2064] lab04pengine: warning: common_apply_stickiness: Forcing clnPingCheck away from lab05 after 1 failures (max=1) Dec 05 09:35:58 [2064] lab04pengine: warning: common_apply_stickiness: Forcing clnPingCheck away from lab05 after 1 failures (max=1) If it helps, pingCheck config: primitive pingCheck ocf:pacemaker:ping \ params \ name=pingNodes \ host_list=192.168.0.12 192.168.0.13 \ multiplier=100 \ op start interval=0 timeout=60s on-fail=restart \ op monitor interval=10 timeout=60s on-fail=restart \ op stop interval=0 timeout=60s on-fail=ignore Thanks again, Brett ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
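The "Forcing clnPingCheck away from lab05 after 1 failures (max=1)" messages suggest the clone is being kept off lab05 by an old failure rather than by anything in the migrate request. Assuming that failure is stale, clearing it is usually enough; the location preference at the end is only an illustration and its id is made up:

    # Clear the old pingCheck failure so the clone may run on lab05 again:
    crm resource cleanup clnPingCheck lab05
    crm_mon -Afr                 # re-check failcounts, attributes and placement
    # Only if vipSlave should actively prefer lab05:
    crm configure location loc-vipslave-on-lab05 vipSlave 100: lab05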
[Pacemaker] Difference between crm resource and crm_resource
Hi, Can someone please explain how the commands - crm resource stop resource name and crm_resource --resource resource name --set-parameter target-role --meta --parameter-value Stopped are different? Also, I see that crm has a -w option (which gives synchronous behaviour to the command) Is there something similar for crm_resource? Thanks, Pavan ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Difference between crm resource and crm_resource
On 2012-12-05T16:51:14, pavan tc pavan...@gmail.com wrote: Hi, Can someone please explain how the commands - crm resource stop resource name and crm_resource --resource resource name --set-parameter target-role --meta --parameter-value Stopped are different? They are not. crm shell just provides a more coherent wrapper around the various commands. Also, I see that crm has a -w option (which gives synchronous behaviour to the command) Is there something similar for crm_resource? No. crm shell then watches the DC until the transition triggered by the change has completed. crm_resource just modifies the configuration. Regards, Lars -- Architect Storage/HA SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg) Experience is the name everyone gives to their mistakes. -- Oscar Wilde ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
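For reference, the two invocations being compared, plus one rough way to approximate crm's -w behaviour with the low-level tools ("my-rsc" is a placeholder resource name and the polling loop is only a sketch):

    # These do the same thing: both set the meta attribute target-role=Stopped.
    crm resource stop my-rsc
    crm_resource --resource my-rsc --set-parameter target-role --meta --parameter-value Stopped

    # crm_resource never waits; a crude substitute for "crm -w" is to poll
    # cluster status until the resource is reported as stopped:
    until crm_mon -1r | grep -q 'my-rsc.*Stopped'; do sleep 2; done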
Re: [Pacemaker] Difference between crm resource and crm_resource
They are not. crm shell just provides a more coherent wrapper around the various commands. Also, I see that crm has a -w option (which gives synchronous behaviour to the command) Is there something similar for crm_resource? No. crm shell then watches the DC until the transition triggered by the change has completed. crm_resource just modifies the configuration. Thanks much. Pavan Regards, Lars -- Architect Storage/HA SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg) Experience is the name everyone gives to their mistakes. -- Oscar Wilde ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Enable remote monitoring
Hi, This is the first step - the support of restart-origin for order constraint along with the test cases: https://github.com/gao-yan/pacemaker/commits/restart-origin It looks straight-forward to me. Hope I didn't miss anything ;-) If restart-origin=true combines with kind=Optional, it just means Optional. So that a failed nagios resource would not affect the vm. I'm not sure if we should relate the restarts count with the migration-threshold of the basic resource. Even without this, users can specify how many failures of a particular nagios resource they can tolerate on a node, the vm will migrate with it anyway. And probably we could have one of the nagios resources, no matter how many times it fails, we just don't want the vm to migrate because of it. On 12/05/12 06:05, Lars Marowsky-Bree wrote: On 2012-12-04T14:48:50, David Vossel dvos...@redhat.com wrote: The resource ordered set with the 'restart-origin' option gets us half way there in the constraint definition. We still have to build the colocation set between the vm and the resources so everything runs on the same node (perhaps I just assumed that was necessary, correct me if I am wrong) Right, we end up with two resource sets. (Unless we allow the restart-origin to be set for the order constraints that are implicit if a colocation resource set is used with sequential=true. Ouch.) Ouch The above is usable, but it requires the user to explicitly set up and manage multiple constraint definitions. It seems to me like we will eventually want to simplify this process. When that time comes, I just want to make sure we approach building the simplified abstraction at the configuration level and have the management tools (crm/pcs) be a transparent extension of whatever we come up with. For what it is worth, I'd agree with this; the fact that the most common constraints are order *AND* colocation and we don't have a (link|chain|join) statement that adequately provides that has been annoying me for a while. ;-) I massively appreciate that we do have the separate dimensions, and people use that - but still, the combination of both is extremely common. The independent order + colocation statements do allow for that though; and in theory, a frontend *could* detect that there's both A first, then B and B where A is with the same priority and present it merged as: join id-494 inf: A B Looks neat :-) Regards, Gao,Yan -- Gao,Yan y...@suse.com Software Engineer China Server Team, SUSE. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
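At the XML level, the option being added in that branch would be used roughly as follows. Note that restart-origin exists only in the proposed patch, not in any released schema, and the resource ids are placeholders:

    <!-- Proposed syntax only; taken from the branch under discussion. -->
    <rsc_colocation id="nagios-with-vm" rsc="nagios-foo" with-rsc="vm" score="INFINITY"/>
    <rsc_order id="vm-then-nagios" first="vm" then="nagios-foo"
               kind="Mandatory" restart-origin="true"/>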
Re: [Pacemaker] Getting Started
Brett, The ocf:heartbeat:pingd resource agent is used to monitor network availability. This resource agent is actually deprecated - the recommended replacement is ocf:pacemaker:ping. You can use ocf:pacemaker:ping with a location constraint to move resources away from a node if it loses network connectivity. For example, to move the group of resources g_resources away from a node that loses network connectivity: primitive p_ping ocf:pacemaker:ping \ params name=p_ping host_list=192.168.0.11 192.168.0.12 dampen=10s multiplier=10 \ op start interval=0 timeout=60 \ op monitor interval=10s timeout=60 clone cl_ping p_ping \ meta interleave=true location loc_run_on_most_connected g_resources \ rule $id=loc_run_on_most_connected-rule -inf: not_defined p_ping or p_ping lte 0 This location constraint will migrate resources away from a node which can't ping any of the hosts defined in p_ping. Andrew - Original Message - From: Brett Maton brett.ma...@googlemail.com To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Sent: Tuesday, December 4, 2012 5:56:12 AM Subject: Re: [Pacemaker] Getting Started The group master-group has me a bit stumped as I'm not using a VIP for replication: group master-group \ vip-master \ vip-rep \ meta \ ordered=false I'm guessing that I don't need to define the group as it would effectively only contain the master VIP? Therefore the colocation rule 2 should directly reference vip-master: colocation rsc_colocation-2 inf: vip-master msPostgresql:Master And the order rules the same: order rsc_order-1 0: clnPingCheck msPostgresql order rsc_order-2 0: msPostgresql:promote vip-master:start symmetrical=false order rsc_order-3 0: msPostgresql:demote vip-master:stop symmetrical=false Now ignorance time :) What's the point of pinging the gateway or am I really just being daft (I had to change pacemaker to heartbeat here)? primitive pingCheck ocf:heartbeat:pingd \ params \ name=default_ping_set \ host_list=192.168.0.254 \ multiplier=100 \ op start timeout=60s interval=0s on-fail=restart \ op monitor timeout=60s interval=10s on-fail=restart \ op stop timeout=60s interval=0s on-fail=ignore Thanks for your patience, Brett -Original Message- From: Takatoshi MATSUO [mailto:matsuo@gmail.com] Sent: 04 December 2012 00:25 To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Getting Started Hi Brett Did you see my sample configuration? https://github.com/t-matsuo/resource-agents/wiki/Resource-Agent-for-PostgreSQL-9.1-streaming-replication 2012/12/4 Brett Maton brett.ma...@googlemail.com: On 3 Dec 2012, at 15:01, Florian Crouzat wrote: On 03/12/2012 15:24, Brett Maton wrote: Hi List, I'm new to corosync / pacemaker so please forgive my ignorance! I currently have Postgres streaming replication between node1(master) and node2(slave, hot standby), the replication user authenticates to master using an md5 password. All good there... My goal is to use pacemaker / heartbeat to move the VIP and promote node2 if node1 fails, without using DRBD or pg-pool. What I'm having trouble with is finding resources for learning what I need to configure with regards to corosync / pacemaker to implement failover. All of the guides I've found use DRBD and/or a much more robust network configuration. I'm currently using CentOS 6.3 with PostgreSQL 9.2 corosync-1.4.1-7.el6_3.1.x86_64 pacemaker-1.1.7-6.el6.x86_64 node1 192.168.0.1 node2 192.168.0.2 dbVIP 192.168.0.101 Any help and suggested reading appreciated.
Thanks in advance, Brett Well, if you don't need shared storage and only a VIP over which postgres runs, I guess the official guide should be good: http://clusterlabs.org/doc/en-US/Pacemaker/1.1-crmsh/html-single/Clus ters_from_Scratch/ Forget the drdb stuff, and base your configuration on the httpd examples that collocates a VIP and an httpd daemon in an active/passive two nodes cluster. (Chapter 6). -- Cheers, Florian Crouzat Thanks for the answers, I'll try again using crm / cman. I've found a pgsql agent patched by Takatoshi Matsuo which should work if I can figure out the configuration! Part of the problem I think is that PostgreSQL streaming replication is kind of active / active insofar as the slave is up and listening (and receiving updates from the master) in read only mode. The default agent from the CentOS repositories kills the slave if the master is up, which means that replication doesn't happen as there is no slave to receive updates :) Thanks, Brett ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
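Before and after reworking constraints like the ones above, it can help to look at the allocation scores the policy engine actually computes; crm_simulate ships with pacemaker 1.1.7 and can do this against the live cluster:

    # Show allocation scores and resulting placement for the live CIB:
    crm_simulate -sL
    # Narrowed to the VIPs discussed in this thread:
    crm_simulate -sL 2>/dev/null | grep -i vip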
Re: [Pacemaker] Enable remote monitoring
- Original Message - From: Yan Gao y...@suse.com To: pacemaker@oss.clusterlabs.org Sent: Wednesday, December 5, 2012 6:27:05 AM Subject: Re: [Pacemaker] Enable remote monitoring Hi, This is the first step - the support of restart-origin for order constraint along with the test cases: https://github.com/gao-yan/pacemaker/commits/restart-origin It looks straight-forward to me. Hope I didn't miss anything ;-) I had made some in-line comments for you in git-hub. It looks like you are on the right track. I'm just not sure about the symmetrical=false use case for order constraints. If restart-origin=true combines with kind=Optional, it just means Optional. So that a failed nagios resource would not affect the vm. I agree, restart-origin is a no-op for advisory ordering. I'm not sure if we should relate the restarts count with the migration-threshold of the basic resource. I don't know what the basic resource refers to here. If we are talking about counting the restarts of the vm towards the migration-threshold, I'd expect the vm to have the same behavior as whatever happens to 'B' right now for the use-case below. Start A then Start B. When A fails restart B. Start vm then Start nagios. When nagios fails restart vm. Even without this, users can specify how many failures of a particular nagios resource they can tolerate on a node, the vm will migrate with it anyway. And probably we could have one of the nagios resources, no matter how many times it fails, we just don't want the vm to migrate because of it. I don't understand this last sentence. -- Vossel On 12/05/12 06:05, Lars Marowsky-Bree wrote: On 2012-12-04T14:48:50, David Vossel dvos...@redhat.com wrote: The resource ordered set with the 'restart-origin' option gets us half way there in the constraint definition. We still have to build the colocation set between the vm and the resources so everything runs on the same node (perhaps I just assumed that was necessary, correct me if I am wrong) Right, we end up with two resource sets. (Unless we allow the restart-origin to be set for the order constraints that are implicit if a colocation resource set is used with sequential=true. Ouch.) Ouch The above is usable, but it requires the user to explicitly set up and manage multiple constraint definitions. It seems to me like we will eventually want to simplify this process. When that time comes, I just want to make sure we approach building the simplified abstraction at the configuration level and have the management tools (crm/pcs) be a transparent extension of whatever we come up with. For what it is worth, I'd agree with this; the fact that the most common constraints are order *AND* colocation and we don't have a (link|chain|join) statement that adequately provides that has been annoying me for a while. ;-) I massively appreciate that we do have the separate dimensions, and people use that - but still, the combination of both is extremely common. The independent order + colocation statements do allow for that though; and in theory, a frontend *could* detect that there's both A first, then B and B where A is with the same priority and present it merged as: join id-494 inf: A B Looks neat :-) Regards, Gao,Yan -- Gao,Yan y...@suse.com Software Engineer China Server Team, SUSE. 
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Enable remote monitoring
On 12/06/12 00:36, David Vossel wrote: - Original Message - From: Yan Gao y...@suse.com To: pacemaker@oss.clusterlabs.org Sent: Wednesday, December 5, 2012 6:27:05 AM Subject: Re: [Pacemaker] Enable remote monitoring Hi, This is the first step - the support of restart-origin for order constraint along with the test cases: https://github.com/gao-yan/pacemaker/commits/restart-origin It looks straight-forward to me. Hope I didn't miss anything ;-) I had made some in-line comments for you in git-hub. It looks like you are on the right track. Thanks! I'm just not sure about the symmetrical=false use case for order constraints. A symmetrical=false implies we don't care about the inverse order. AFAICS, we shouldn't still restart the origin for this case. If restart-origin=true combines with kind=Optional, it just means Optional. So that a failed nagios resource would not affect the vm. I agree, restart-origin is a no-op for advisory ordering. I'm not sure if we should relate the restarts count with the migration-threshold of the basic resource. I don't know what the basic resource refers to here. The origin. If we are talking about counting the restarts of the vm towards the migration-threshold, Yep I'd expect the vm to have the same behavior as whatever happens to 'B' right now for the use-case below. Start A then Start B. When A fails restart B. Start vm then Start nagios. When nagios fails restart vm. Sure, we have the behaviors with the code. I think we are talking about the failure count of the VM should only affected by its own monitor, or also by the resources within it. Even without this, users can specify how many failures of a particular nagios resource they can tolerate on a node, the vm will migrate with it anyway. And probably we could have one of the nagios resources, no matter how many times it fails, we just don't want the vm to migrate because of it. I don't understand this last sentence. If we didn't set a migration-threshold for a nagios resource, that means we could always allow it to recover on a node if possible. BTW, I believe we usually put new options into the 1.2.rng to settle for a bit before promoting them into the 1.1 scheme. We changed the rule? We used to put them in 1.1 first and promote into 1.2 later when I did the other features. AFAIK, validate-with is initially set to pacemaker-1.2, which means users would get the feature immediately, no? Regards, Gao,Yan -- Gao,Yan y...@suse.com Software Engineer China Server Team, SUSE. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
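Since the question turns on which schema a CIB is actually validated against, two standard commands are relevant here:

    # Which schema does the live CIB declare?
    cibadmin --query | grep -o 'validate-with="[^"]*"'
    # Bump the CIB to the newest schema this pacemaker build understands:
    cibadmin --upgrade --force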
Re: [Pacemaker] Enable remote monitoring
- Original Message - From: Yan Gao y...@suse.com To: pacemaker@oss.clusterlabs.org Sent: Wednesday, December 5, 2012 12:00:57 PM Subject: Re: [Pacemaker] Enable remote monitoring On 12/06/12 00:36, David Vossel wrote: - Original Message - From: Yan Gao y...@suse.com To: pacemaker@oss.clusterlabs.org Sent: Wednesday, December 5, 2012 6:27:05 AM Subject: Re: [Pacemaker] Enable remote monitoring Hi, This is the first step - the support of restart-origin for order constraint along with the test cases: https://github.com/gao-yan/pacemaker/commits/restart-origin It looks straight-forward to me. Hope I didn't miss anything ;-) I had made some in-line comments for you in git-hub. It looks like you are on the right track. Thanks! I'm just not sure about the symmetrical=false use case for order constraints. A symmetrical=false implies we don't care about the inverse order. AFAICS, we shouldn't still restart the origin for this case. Yeah, I suppose you are right. I wouldn't have thought of these two options as being related, but we need that inverse constraint to force the restart of A. Utilizing the inverse order constraint internally makes the implementation of this option much cleaner than it would be otherwise. I have no idea why someone would want to do this... but what would happen with the following. start A then promote B restart-origin=true would A get restarted when B is demoted... or when B fails/stops? If restart-origin=true combines with kind=Optional, it just means Optional. So that a failed nagios resource would not affect the vm. I agree, restart-origin is a no-op for advisory ordering. I'm not sure if we should relate the restarts count with the migration-threshold of the basic resource. I don't know what the basic resource refers to here. The origin. If we are talking about counting the restarts of the vm towards the migration-threshold, Yep I'd expect the vm to have the same behavior as whatever happens to 'B' right now for the use-case below. Start A then Start B. When A fails restart B. Start vm then Start nagios. When nagios fails restart vm. Sure, we have the behaviors with the code. I think we are talking about the failure count of the VM should only affected by its own monitor, or also by the resources within it. I see. Mapping the failcount of one resource to another resource seems like it would be difficult for us to represent in the configuration without using some sort of container group like object where the parent resource inherited failures from the children. Even without this, users can specify how many failures of a particular nagios resource they can tolerate on a node, the vm will migrate with it anyway. And probably we could have one of the nagios resources, no matter how many times it fails, we just don't want the vm to migrate because of it. I don't understand this last sentence. If we didn't set a migration-threshold for a nagios resource, that means we could always allow it to recover on a node if possible. BTW, I believe we usually put new options into the 1.2.rng to settle for a bit before promoting them into the 1.1 scheme. We changed the rule? We used to put them in 1.1 first and promote into 1.2 later when I did the other features. AFAIK, validate-with is initially set to pacemaker-1.2, which means users would get the feature immediately, no? Ah, I got the whole thing backwards. You are correct. Sorry :) -- Vossel Regards, Gao,Yan -- Gao,Yan y...@suse.com Software Engineer China Server Team, SUSE. 
___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Enable remote monitoring
On 12/06/12 04:52, David Vossel wrote: Hi, This is the first step - the support of restart-origin for order constraint along with the test cases: https://github.com/gao-yan/pacemaker/commits/restart-origin It looks straight-forward to me. Hope I didn't miss anything ;-) I had made some in-line comments for you in git-hub. It looks like you are on the right track. Thanks! I'm just not sure about the symmetrical=false use case for order constraints. A symmetrical=false implies we don't care about the inverse order. AFAICS, we shouldn't still restart the origin for this case. Yeah, I suppose you are right. I wouldn't have thought of these two options as being related, but we need that inverse constraint to force the restart of A. Utilizing the inverse order constraint internally makes the implementation of this option much cleaner than it would be otherwise. I have no idea why someone would want to do this... but what would happen with the following. start A then promote B restart-origin=true would A get restarted when B is demoted... or when B fails/stops? Hmm, you are right. I missed that somehow. We should rethink how to implement it in a more proper way. If restart-origin=true combines with kind=Optional, it just means Optional. So that a failed nagios resource would not affect the vm. I agree, restart-origin is a no-op for advisory ordering. I'm not sure if we should relate the restarts count with the migration-threshold of the basic resource. I don't know what the basic resource refers to here. The origin. If we are talking about counting the restarts of the vm towards the migration-threshold, Yep I'd expect the vm to have the same behavior as whatever happens to 'B' right now for the use-case below. Start A then Start B. When A fails restart B. Start vm then Start nagios. When nagios fails restart vm. Sure, we have the behaviors with the code. I think we are talking about the failure count of the VM should only affected by its own monitor, or also by the resources within it. I see. Mapping the failcount of one resource to another resource seems like it would be difficult for us to represent in the configuration without using some sort of container group like object where the parent resource inherited failures from the children. Indeed. Even without this, users can specify how many failures of a particular nagios resource they can tolerate on a node, the vm will migrate with it anyway. And probably we could have one of the nagios resources, no matter how many times it fails, we just don't want the vm to migrate because of it. I don't understand this last sentence. If we didn't set a migration-threshold for a nagios resource, that means we could always allow it to recover on a node if possible. BTW, I believe we usually put new options into the 1.2.rng to settle for a bit before promoting them into the 1.1 scheme. We changed the rule? We used to put them in 1.1 first and promote into 1.2 later when I did the other features. AFAIK, validate-with is initially set to pacemaker-1.2, which means users would get the feature immediately, no? Ah, I got the whole thing backwards. You are correct. Sorry :) No problem :-) Regards, Gao,Yan -- Gao,Yan y...@suse.com Software Engineer China Server Team, SUSE. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Enable remote monitoring
On 04/12/2012, at 9:20 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-12-03T16:32:14, David Vossel dvos...@redhat.com wrote: + optional + attribute name=restart-origindata type=boolean//attribute + /optional I don't feel strongly about this. Here's what comes to mind for me. force-recover - force recovery of both sides of the constraint if either side fails We actually have a precedent here - grep for restart_type. ;-) Not a very good precedent though. I believe it was bad enough that I removed it from the docs. Although I would support refining the idea into something a bit saner that also handled this case. force-recover doesn't quite fit for me; because for the first-then distinction, we already have the 0 versus inf score differentiation. What would force-recover do for score=0? (Ohhh. Did we just find a use for a negative score here? ;-) Just throwing that out there. It'd fit the model we have so far, is all I'm saying.) Scores are deprecated for ordering constraints. Mostly because they make no sense :-) We do have 'kind' though. But - I think restarting alone doesn't suffice. Do we want to have the restarts count towards the migration-threshold of the parent resource too? I think we may want that. If we want to stick with the terminology, restart-first (but -origin sounds better, so I don't feel that strongly either) as a tri-state (no (default), yes, treat-as-failure (anyone got a snappy idea for that one?) might make be advisable. What about inherit-failure = true|false? Here's a thought. Add the new constraint flag as well as a new option on the primitive that escalates failures to the parent resource (pretty sure this idea isn't mine, maybe Andrew threw it at me a few weeks ago) Then you could do something like this. primitive vm group vm-resources primitive nagios-monitor-foo primitive nagios-monitor-bar order vm then vm-resources reset-origin colocation vm vm-resources. It isn't as simple (configuration wise, not implementation wise) as the container concept, but at least this way you don't have to build relationships between the vm and every resource in it explicitly. It seems like leveraging groups here would be a good idea. One the one hand, this makes a lot of sense. But when we're going this far, why not directly: group vm-with-resources vm nagios-monitor-foo nagios-monitor-bar \ meta restart-origin=true Right. Apart from the name. Really not crazy on origin. There's really nothing (apart from a naming convention) to suggest that origin means the vm resource. If we go this way, how about something like propagate-failure=bool or delegate-failure=bool, or just simply: failure-delegate=${resource_name}. The last one is probably now my favourite, even a trained monkey should be able to figure out what that construct implies :) ? All we'd be doing is flip the restart-origin bit on the orders implicit in the group. Regards, Lars -- Architect Storage/HA SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg) Experience is the name everyone gives to their mistakes. 
-- Oscar Wilde ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
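In crm syntax, the group-based spelling with Andrew's preferred attribute name would look roughly like this; failure-delegate is only a proposal at this point, and the resource names come from the example above:

    group vm-with-resources vm nagios-monitor-foo nagios-monitor-bar \
        meta failure-delegate=vm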
Re: [Pacemaker] Enable remote monitoring
On 05/12/2012, at 3:45 AM, David Vossel dvos...@redhat.com wrote: - Original Message - From: Lars Marowsky-Bree l...@suse.com To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Sent: Tuesday, December 4, 2012 6:59:08 AM Subject: Re: [Pacemaker] Enable remote monitoring On 2012-12-04T19:48:18, Gao,Yan y...@suse.com wrote: Yes, I think this looks good. The patch to the schema I proposed supports this already ;-) So it seems that nobody had any serious objections to this approach, but we were fiddling with details and can't actually decide what we like better, if anything. ;-) So why don't we proceed with the initial suggestion for now implement it (restart-origin attribute on order constraint), play with it for 1-2 months and fine tune it in practice? Andrew, David, any objections to that? The main thing I want to avoid is an explosion of order and colocation constraints in the configuration to support this functionality. Pushing this off to configuration management tools like crmsh and psd may help people avoid configuration mistakes, but when it comes to actually debugging what's going on, having a huge constraint section in the config is a nightmare. Using concepts like groups and sets make these relationships more obvious. I would really like us to move forward with some sort of abstraction that allows the relationship between the virtual machine and resources within it to be defined as simple as possible. I am okay with this constraint option being implemented, as it is the basis for this whole concept. When it comes time to make this usable, don't make the abstraction people use to configure this relationship live at the crm shell... meaning, Don't introduce the idea of a container object in the shell which then goes off and explodes the constraint section under the hood. Personally, I'm ok with the shell exploding things. A 1-1 mapping between XML and crmsh isn't strictly necessary. My original motivations for the container concept were two-fold 1) obviousness to the user 2) implementation flexibility for developers If we expose a new constraint or resource option, then 2. goes out the window and 1. doesn't strictly need to be done at the XML level anymore. There was also 3) Limited scope for misuse :) But I'm reasonably confident with 'failure-delegate' on that front. Think this through and come up with a plan to represent what is going on at the configuration level. -- Vossel Regards, Lars -- Architect Storage/HA SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg) Experience is the name everyone gives to their mistakes. -- Oscar Wilde ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Enable remote monitoring
On 05/12/2012, at 4:05 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-12-04T11:45:16, David Vossel dvos...@redhat.com wrote: I am okay with this constraint option being implemented, as it is the basis for this whole concept. When it comes time to make this usable, don't make the abstraction people use to configure this relationship live at the crm shell... meaning, Don't introduce the idea of a container object in the shell which then goes off and explodes the constraint section under the hood. Think this through and come up with a plan to represent what is going on at the configuration level. A resource set already is defined in the constraint section, like Yan said. That seems to do what you ask for? We have the primitives etc defined in the resources section and then glue them together in the constraints; that's as intended. Objects and their relationships. Is there something you don't like about Yan's proposal? Sorry for asking a dumb question, but I can't tell from the above what you'd like to see changed. How would you make this more usable? Yes, a frontend might decide to render resource sets special (more like how groups are handled[1]), but I'm not sure I understand what you're suggesting. Regards, Lars [1] and it'd perhaps even be cleaner if, indeed, we had resource sets instead of groups, and could reference them as aggregates as well. But that may be a different discussion. I would very much like to ditch groups for sets, but there are still some things I just can't get to work without the group pseudo resource. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Enable remote monitoring
On 05/12/2012, at 11:27 PM, Gao,Yan y...@suse.com wrote: Hi, This is the first step - the support of restart-origin for order constraint along with the test cases: https://github.com/gao-yan/pacemaker/commits/restart-origin It looks straight-forward to me. Hope I didn't miss anything ;-) If restart-origin=true combines with kind=Optional, it just means Optional. So that a failed nagios resource would not affect the vm. I'm not sure if we should relate the restarts count with the migration-threshold of the basic resource. Even without this, users can specify how many failures of a particular nagios resource they can tolerate on a node, the vm will migrate with it anyway. Does that make sense though? You've not achieved anything a restart wouldn't have done. The choice to move the VM should be up to the VM. And probably we could have one of the nagios resources, no matter how many times it fails, we just don't want the vm to migrate because of it. On 12/05/12 06:05, Lars Marowsky-Bree wrote: On 2012-12-04T14:48:50, David Vossel dvos...@redhat.com wrote: The resource ordered set with the 'restart-origin' option gets us half way there in the constraint definition. We still have to build the colocation set between the vm and the resources so everything runs on the same node (perhaps I just assumed that was necessary, correct me if I am wrong) Right, we end up with two resource sets. (Unless we allow the restart-origin to be set for the order constraints that are implicit if a colocation resource set is used with sequential=true. Ouch.) Ouch The above is usable, but it requires the user to explicitly set up and manage multiple constraint definitions. It seems to me like we will eventually want to simplify this process. When that time comes, I just want to make sure we approach building the simplified abstraction at the configuration level and have the management tools (crm/pcs) be a transparent extension of whatever we come up with. For what it is worth, I'd agree with this; the fact that the most common constraints are order *AND* colocation and we don't have a (link|chain|join) statement that adequately provides that has been annoying me for a while. ;-) I massively appreciate that we do have the separate dimensions, and people use that - but still, the combination of both is extremely common. The independent order + colocation statements do allow for that though; and in theory, a frontend *could* detect that there's both A first, then B and B where A is with the same priority and present it merged as: join id-494 inf: A B Looks neat :-) Regards, Gao,Yan -- Gao,Yan y...@suse.com Software Engineer China Server Team, SUSE. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Enable remote monitoring
On 06/12/2012, at 5:00 AM, Gao,Yan y...@suse.com wrote: On 12/06/12 00:36, David Vossel wrote: - Original Message - From: Yan Gao y...@suse.com To: pacemaker@oss.clusterlabs.org Sent: Wednesday, December 5, 2012 6:27:05 AM Subject: Re: [Pacemaker] Enable remote monitoring Hi, This is the first step - the support of restart-origin for order constraint along with the test cases: https://github.com/gao-yan/pacemaker/commits/restart-origin It looks straight-forward to me. Hope I didn't miss anything ;-) I had made some in-line comments for you in git-hub. It looks like you are on the right track. Thanks! I'm just not sure about the symmetrical=false use case for order constraints. A symmetrical=false implies we don't care about the inverse order. AFAICS, we shouldn't still restart the origin for this case. symmetrical=false makes no sense here. If we stay with the restart-origin approach, then is should override symmetrical=false. If restart-origin=true combines with kind=Optional, it just means Optional. So that a failed nagios resource would not affect the vm. I agree, restart-origin is a no-op for advisory ordering. I'm not sure if we should relate the restarts count with the migration-threshold of the basic resource. I don't know what the basic resource refers to here. The origin. If we are talking about counting the restarts of the vm towards the migration-threshold, Yep I'd expect the vm to have the same behavior as whatever happens to 'B' right now for the use-case below. Start A then Start B. When A fails restart B. Start vm then Start nagios. When nagios fails restart vm. Sure, we have the behaviors with the code. I think we are talking about the failure count of the VM should only affected by its own monitor, or also by the resources within it. Even without this, users can specify how many failures of a particular nagios resource they can tolerate on a node, the vm will migrate with it anyway. And probably we could have one of the nagios resources, no matter how many times it fails, we just don't want the vm to migrate because of it. I don't understand this last sentence. If we didn't set a migration-threshold for a nagios resource, that means we could always allow it to recover on a node if possible. BTW, I believe we usually put new options into the 1.2.rng to settle for a bit before promoting them into the 1.1 scheme. We changed the rule? No, David has it backwards :) We used to put them in 1.1 first and promote into 1.2 later when I did the other features. AFAIK, validate-with is initially set to pacemaker-1.2, which means users would get the feature immediately, no? Regards, Gao,Yan -- Gao,Yan y...@suse.com Software Engineer China Server Team, SUSE. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Enable remote monitoring
On 05/12/2012, at 9:05 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-12-04T14:48:50, David Vossel dvos...@redhat.com wrote: The resource ordered set with the 'restart-origin' option gets us half way there in the constraint definition. We still have to build the colocation set between the vm and the resources so everything runs on the same node (perhaps I just assumed that was necessary, correct me if I am wrong) Right, we end up with two resource sets. (Unless we allow the restart-origin to be set for the order constraints that are implicit if a colocation resource set is used with sequential=true. Ouch.) The above is usable, but it requires the user to explicitly set up and manage multiple constraint definitions. It seems to me like we will eventually want to simplify this process. When that time comes, I just want to make sure we approach building the simplified abstraction at the configuration level and have the management tools (crm/pcs) be a transparent extension of whatever we come up with. For what it is worth, I'd agree with this; the fact that the most common constraints are order *AND* colocation and we don't have a (link|chain|join) statement that adequately provides that has been annoying me for a while. ;-) I massively appreciate that we do have the separate dimensions, and people use that - but still, the combination of both is extremely common. Agreed. I'm still torn whether this is a GUI/shell job or something we need to add to the underlying xml. The independent order + colocation statements do allow for that though; and in theory, a frontend *could* detect that there's both A first, then B and B where A is with the same priority and present it merged as: join id-494 inf: A B Regards, Lars -- Architect Storage/HA SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg) Experience is the name everyone gives to their mistakes. -- Oscar Wilde ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] pacemaker processes RSS growth
I wonder what the growth looks like with the recent libqb fix. That could be an explanation. On Sat, Sep 15, 2012 at 5:23 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 14.09.2012 09:54, Vladislav Bogdanov wrote: 13.09.2012 15:18, Vladislav Bogdanov wrote: ... and now it runs on my testing cluster. Ipc-related memory problems seem to be completely fixed now, processes own memory (RES-SHR in terms of htop) does not grow any longer (after 40 minutes). Although I see that both RES and SHR counters sometimes increase synchronously. lrmd does not grow at all. Will look again after few hours. So, lrmd is ok. I see only 4kb growth in RES-SHR on one node (current DC). Other instances are of the constant size for almost a day. I see RES-SHR growth in pacemakerd (100kb per day). So I expect some leakage here. Should I run it under valgrind? Valgrind doesn't find anything valuable here (1 and 9 hours runs). ==23851== LEAK SUMMARY: ==23851==definitely lost: 528 bytes in 3 blocks ==23851==indirectly lost: 17,361 bytes in 36 blocks ==23851== possibly lost: 234 bytes in 8 blocks ==23851==still reachable: 17,458 bytes in 163 blocks ==23851== suppressed: 0 bytes in 0 blocks And I see that both RES and SHR synchronously grow in crmd (600-700kb per day on member nodes, 6Mb on DC), while RES-SHR is reduced by 24kb on DC. And I see cib growth in both RES and SHR in range 12-340 kb, and 4kb growth in RES-SHR on nodes except DC. I can't say for sure what causes growth of shared pages. May be it is /dev/shm. Lot of files are there. I'll look if it grows. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Corosync version '1.4.4' and its compatability with Pacemaker version.
On Wed, Dec 5, 2012 at 2:46 PM, Dhiraj Hadkar dhiraj.had...@alepo.com wrote: Andrew, Thanks for your response My Questions were: which version of pacemaker Is coro 1.4.4 compatible with. Anything in the last few years. can rhel 5.4 support coro 1.4.4. I believe so. I put up some RHEL5 rpms up at http://clusterlabs.org/rpm-next Or can you advise the best combination for unicast support OS (RHEL version) corosync version pacemaker version Thanks in advance, B.R. Dhiraj From: Andrew Beekhof [and...@beekhof.net] Sent: Wednesday, December 05, 2012 6:05 AM To: The Pacemaker cluster resource manager Subject: Re: [Pacemaker] Corosync version '1.4.4' and its compatability with Pacemaker version. The level of detail is good, but is there a question too? On Tuesday, December 4, 2012, Dhiraj Hadkar wrote: Hello Andrew, I have Corosync version: [root@node1]# corosync -v Corosync Cluster Engine, version '1.4.4' Copyright (c) 2006-2009 Red Hat, Inc. [root@node2]# corosync -v Corosync Cluster Engine, version '1.4.4' Copyright (c) 2006-2009 Red Hat, Inc. This I am inclined to use for unicast support. Would this have issues when I try with RHEL 5.4, also which is the best suited Pacemaker version for this version of corosync. Details of my setup: /var/log/cluster/corosync.log contents: Dec 01 17:59:22 corosync [MAIN ] Corosync Cluster Engine ('1.4.4'): started and ready to provide service. Dec 01 17:59:22 corosync [MAIN ] Corosync built-in features: nss Dec 01 17:59:22 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'. Dec 01 17:59:22 corosync [TOTEM ] Initializing transport (UDP/IP Unicast). Dec 01 17:59:22 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). Dec 01 17:59:23 corosync [TOTEM ] The network interface [172.16.202.153] is now up. Dec 01 17:59:23 corosync [SERV ] Service engine loaded: corosync extended virtual synchrony service Dec 01 17:59:23 corosync [SERV ] Service engine loaded: corosync configuration service Dec 01 17:59:23 corosync [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 Dec 01 17:59:23 corosync [SERV ] Service engine loaded: corosync cluster config database access v1.01 Dec 01 17:59:23 corosync [SERV ] Service engine loaded: corosync profile loading service Dec 01 17:59:23 corosync [SERV ] Service engine loaded: corosync cluster quorum service v0.1 Dec 01 17:59:23 corosync [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine. 
Dec 01 17:59:23 corosync [TOTEM ] adding new UDPU member {172.16.202.153} Dec 01 17:59:23 corosync [TOTEM ] adding new UDPU member {172.16.202.154} Corosync.conf: # Please read the corosync.conf.5 manual page compatibility: whitetank totem { version: 2 secauth: off interface { member { memberaddr: 172.16.202.153 } member { memberaddr: 172.16.202.154 } ringnumber: 0 bindnetaddr: 172.16.0.0 mcastport: 5405 ttl: 1 } transport: udpu } logging { fileline: off to_logfile: yes to_syslog: yes logfile: /var/log/cluster/corosync.log debug: off timestamp: on logger_subsys { subsys: AMF debug: off } } Verification for config creation: [root@node1]# /sbin/ifconfig eth1 | grep inet addr | awk -F: '{print $2}' | awk '{print $1}' 172.16.202.153 [root@node1]# ipcalc -n `ip addr show eth1 | grep 'inet ' |awk '{print $2}'` | awk -F= '{print $2}' 172.16.0.0 [root@node2]# /sbin/ifconfig eth1 | grep inet addr | awk -F: '{print $2}' | awk '{print $1}' 172.16.202.154 [root@node2 libqb]# ipcalc -n `ip addr show eth1 | grep 'inet ' |awk '{print $2}'` | awk -F= '{print $2}' 172.16.0.0 [root@node1]# corosync-cfgtool -s Printing ring status. Local node ID -1714810708 RING ID 0 id = 172.16.202.153 status = ring 0 active with no faults [root@node2 pacemaker]# corosync-cfgtool -s Printing ring status. Local node ID -1698033492 RING ID 0 id = 172.16.202.154 status = ring 0 active with no faults [root@node2 pacemaker]# Thanks in Advance for advise, Best Regards, Dhir This email (message and any attachment) is confidential and may be privileged. If you are not certain that you are the intended recipient, please notify the sender immediately by replying to this message, and delete all copies of this message and attachments. Any other use of this email by you is prohibited. ___ Pacemaker
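The corosync.conf quoted above was flattened by the archive; the totem section that matters for unicast, re-wrapped with the same values as posted:

    totem {
        version: 2
        secauth: off
        interface {
            member {
                memberaddr: 172.16.202.153
            }
            member {
                memberaddr: 172.16.202.154
            }
            ringnumber: 0
            bindnetaddr: 172.16.0.0
            mcastport: 5405
            ttl: 1
        }
        transport: udpu
    }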
Re: [Pacemaker] node status does not change even if pacemakerd dies
On Wed, Dec 5, 2012 at 8:32 PM, Kazunori INOUE inouek...@intellilink.co.jp wrote: (12.12.05 02:02), David Vossel wrote: - Original Message - From: Kazunori INOUE inouek...@intellilink.co.jp To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Sent: Monday, December 3, 2012 11:41:56 PM Subject: Re: [Pacemaker] node status does not change even if pacemakerd dies (12.12.03 20:24), Andrew Beekhof wrote: On Mon, Dec 3, 2012 at 8:15 PM, Kazunori INOUE inouek...@intellilink.co.jp wrote: (12.11.30 23:52), David Vossel wrote: - Original Message - From: Kazunori INOUE inouek...@intellilink.co.jp To: pacemaker@oss pacemaker@oss.clusterlabs.org Sent: Friday, November 30, 2012 2:38:50 AM Subject: [Pacemaker] node status does not change even if pacemakerd dies Hi, I am testing the latest version. - ClusterLabs/pacemaker 9c13d14640(Nov 27, 2012) - corosync 92e0f9c7bb(Nov 07, 2012) - libqb 30a7871646(Nov 29, 2012) Although I killed pacemakerd, node status did not change. [dev1 ~]$ pkill -9 pacemakerd [dev1 ~]$ crm_mon : Stack: corosync Current DC: dev2 (2472913088) - partition with quorum Version: 1.1.8-9c13d14 2 Nodes configured, unknown expected votes 0 Resources configured. Online: [ dev1 dev2 ] [dev1 ~]$ ps -ef|egrep 'corosync|pacemaker' root 11990 1 1 16:05 ?00:00:00 corosync 496 12010 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/cib root 12011 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/stonithd root 12012 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/lrmd 496 12013 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/attrd 496 12014 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/pengine 496 12015 1 0 16:05 ?00:00:00 /usr/libexec/pacemaker/crmd We want the node status to change to OFFLINE(stonith-enabled=false), UNCLEAN(stonith-enabled=true). That is, we want the function of this deleted code. https://github.com/ClusterLabs/pacemaker/commit/dfdfb6c9087e644cb898143e198b240eb9a928b4 How are you launching pacemakerd? The systemd service script relaunches pacemakerd on failure and pacemakerd has the ability to attach to all the old processes if they are still around as if nothing happened. -- Vossel Hi David, We are using RHEL6 and use it for a while after this. Therefore, I start it by the following commands. $ /etc/init.d/pacemakerd start or $ service pacemaker start Ok. Are you using the pacemaker plugin? When using cman or corosync 2.0, pacemakerd isn't strictly needed for normal operation. Its only there to shutdown and/or respawn failed components. We are using corosync 2.1, so service does not stop normally after pacemakerd died. $ pkill -9 pacemakerd $ service pacemaker stop $ echo $? 0 $ ps -ef|egrep 'corosync|pacemaker' root 3807 1 0 13:10 ?00:00:00 corosync 496 3827 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/cib root 3828 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/stonithd root 3829 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/lrmd 496 3830 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/attrd 496 3831 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/pengine 496 3832 1 0 13:10 ?00:00:00 /usr/libexec/pacemaker/crmd Ah yes, that is a problem. Having pacemaker still running when the init script says it is down... that is bad. Perhaps we should just make the init script smart enough to check to make sure all the pacemaker components are down after pacemakerd is down. The argument of whether or not the failure of pacemakerd is something that the cluster should be alerted to is something i'm not sure about. 
Hi, I think that relaunch processes of pacemakerd is a very useful function, so I want to avoid management of a resource in the node in which pacemakerd does not exist.

You do understand that the node will be fenced if any of those processes fail, right? It's not as if a node could end up in a bad state just because pacemakerd isn't around to respawn things; the relaunching of processes is there in an attempt to recover before anyone else notices. So essentially what you're asking for is to fence the node and migrate all the resources, so that in the future, IF another process dies, we MIGHT not have to fence.
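As a minimal sketch of the init-script check suggested earlier in this thread (the daemon names and path are taken from the ps listings above; this is not the shipped init script):

    # After "service pacemaker stop", warn if any Pacemaker component survived.
    for daemon in cib stonithd lrmd attrd pengine crmd; do
        if pgrep -f "/usr/libexec/pacemaker/$daemon" >/dev/null; then
            echo "WARNING: $daemon still running after pacemaker stop" >&2
        fi
    done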
[Pacemaker] Nodes OFFLINE with "not in our membership" messages
Hi, I have now hit this issue twice in my setup. I see the following github commit addressing this issue: https://github.com/ClusterLabs/pacemaker/commit/03f6105592281901cc10550b8ad19af4beb5f72f From the patch, it appears there is an incorrect conclusion about the status of the membership of nodes. Is there a root cause analysis of this issue that I can read through? I am currently using 1.1.7. Would the suggestion be to move to 1.1.8, or is there a workaround? (I have already done a good deal of testing with 1.1.7, and would like to live with it if possible) Thanks, Pavan
Re: [Pacemaker] Enable remote monitoring
Hi Andrew, Thanks for the comments!

On 12/06/12 09:44, Andrew Beekhof wrote:
On 05/12/2012, at 11:27 PM, Gao,Yan y...@suse.com wrote:
Hi, This is the first step: support for restart-origin on order constraints, along with the test cases: https://github.com/gao-yan/pacemaker/commits/restart-origin It looks straightforward to me. Hope I didn't miss anything ;-) If restart-origin=true is combined with kind=Optional, it just means Optional, so a failed nagios resource would not affect the VM. I'm not sure whether we should tie the restart count to the migration-threshold of the base resource. Even without this, users can specify how many failures of a particular nagios resource they can tolerate on a node; the VM will migrate with it anyway.
Does that make sense though? You've not achieved anything a restart wouldn't have done. The choice to move the VM should be up to the VM.
If the fail-count of a nagios resource reaches its own migration-threshold, the colocated VM should migrate with it anyway, shouldn't it?

I like the concept of failure-delegate. If we introduce it, it sounds more like a resource's meta/op attribute to me, rather than something that belongs in an order constraint or group. What do you think?

Regards,
Gao,Yan
--
Gao,Yan y...@suse.com
Software Engineer
China Server Team, SUSE.
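As a concrete illustration of the existing mechanism Gao,Yan refers to (a sketch only; the resource IDs "nagios" and "vm" are invented, and restart-origin itself is not involved here): with a mandatory colocation and a migration-threshold on the nagios resource, enough failures of the check push it, and the VM colocated with it, off the node.

    # Illustrative only: colocate the check with the VM, then cap the number of
    # check failures tolerated per node before the pair is moved elsewhere.
    cibadmin -C -o constraints -X \
      '<rsc_colocation id="nagios-with-vm" rsc="nagios" with-rsc="vm" score="INFINITY"/>'
    crm_resource --resource nagios --meta --set-parameter migration-threshold --parameter-value 2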
Re: [Pacemaker] pacemaker processes RSS growth
06.12.2012 06:05, Andrew Beekhof wrote:
I wonder what the growth looks like with the recent libqb fix. That could be an explanation.

Valid point. I will watch.

On Sat, Sep 15, 2012 at 5:23 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote:
14.09.2012 09:54, Vladislav Bogdanov wrote:
13.09.2012 15:18, Vladislav Bogdanov wrote:
... and now it runs on my testing cluster. IPC-related memory problems seem to be completely fixed now; the processes' own memory (RES-SHR in terms of htop) does not grow any longer (after 40 minutes). Although I see that both the RES and SHR counters sometimes increase synchronously. lrmd does not grow at all. Will look again after a few hours.

So, lrmd is OK. I see only 4 KB growth in RES-SHR on one node (the current DC). The other instances have stayed a constant size for almost a day. I do see RES-SHR growth in pacemakerd (100 KB per day), so I expect some leakage there. Should I run it under valgrind?

Valgrind doesn't find anything valuable here (1-hour and 9-hour runs).

==23851== LEAK SUMMARY:
==23851==    definitely lost: 528 bytes in 3 blocks
==23851==    indirectly lost: 17,361 bytes in 36 blocks
==23851==      possibly lost: 234 bytes in 8 blocks
==23851==    still reachable: 17,458 bytes in 163 blocks
==23851==         suppressed: 0 bytes in 0 blocks

And I see that both RES and SHR grow synchronously in crmd (600-700 KB per day on member nodes, 6 MB on the DC), while RES-SHR is reduced by 24 KB on the DC. And I see cib growth in both RES and SHR in the range of 12-340 KB, and 4 KB growth in RES-SHR on nodes other than the DC. I can't say for sure what causes the growth of shared pages. Maybe it is /dev/shm; there are a lot of files there. I'll check whether it grows.
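For anyone wanting to repeat this kind of measurement, a rough sketch follows; the daemon names are the ones from the ps listings earlier in the digest, and it samples only plain RSS (coarser than the RES-SHR figure discussed above) plus /dev/shm usage:

    # Sample resident set size (KB) of the Pacemaker daemons and /dev/shm usage;
    # run periodically (cron or watch) and diff the snapshots to spot growth.
    date
    ps -o pid,rss,comm -C pacemakerd,cib,stonithd,lrmd,attrd,pengine,crmd
    du -sh /dev/shm
    ls /dev/shm | wc -l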
Re: [Pacemaker] Pengine assert in qb_log_from_external_source()
29.11.2012 09:36, Angus Salkeld wrote:
... so, qb_array_index() fails once idx spans the uint16_t boundary (0xffff) and (uint16_t)idx wraps around to 0. IMHO this naturally means some kind of integer overflow.
Well done, I'll have a closer look at it. Patch here: https://github.com/asalkeld/libqb/commit/30a7871646c1f5bbb602e0a01f5550a4516b36f8

No asserts for the last 23 hours, so the issue seems to be fixed.

Best,
Vladislav