Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On 2/24/12 3:36 PM, William Seligman wrote:
On 2/17/12 7:30 AM, Dejan Muhamedagic wrote:
OK, I guess that'd also be doable by checking the following variables:
OCF_RESKEY_CRM_meta_notify_inactive_resource (set of currently inactive instances)
OCF_RESKEY_CRM_meta_notify_stop_resource (set of instances which were just stopped)
Any volunteers for a patch? :)

a) I have a test cluster that I can bring up and down at will;
b) I'm a glutton for punishment.

So I'll volunteer, since I offered to try to do something in the first place. I think I've got a handle on what to look for; e.g., one has to look for notify_type=pre and notify_operation=stop in the 'node_down' test.

Here's my patch, in my usual overly-commented style. Notes:

- To make this work, you need to turn on notify in the clone resources; e.g.,
  clone ipaddr2_clone ipaddr2_resource meta notify=true
  None of the clone examples I saw in the documentation (Clusters From Scratch, Pacemaker Explained) show the notify option; only the ms examples do. You may want to revise the documentation with an IPaddr2 example.
- I tested this with my two-node cluster, and it works. I wrote it for a multi-node cluster, but I can't be sure it will work for more than two nodes. Would some nice person test this?
- I wrote my code assuming that the clone number assigned to a node would remain constant. If the clone numbers were to change by deleting/adding a node to the cluster, I don't know what would happen.

Enjoy!

--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu
PO Box 137 | Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/

--- IPaddr2.ori	2012-02-16 11:51:04.942688344 -0500
+++ /usr/lib/ocf/resource.d/heartbeat/IPaddr2	2012-02-27 15:23:46.856510474 -0500
@@ -13,6 +13,7 @@
 # Copyright (c) 2003 Tuomo Soini
 # Copyright (c) 2004-2006 SUSE LINUX AG, Lars Marowsky-Brée
 #	All Rights Reserved.
+# Additions for high availability 2012 by William Seligman
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of version 2 of the GNU General Public License as
@@ -86,7 +87,7 @@
 This Linux-specific resource manages IP alias IP addresses.
 It can add an IP alias, or remove one.
 In addition, it can implement Cluster Alias IP functionality
-if invoked as a clone resource.
+if invoked as a clone resource with 'meta notify=true'.
 </longdesc>
 <shortdesc lang="en">Manages virtual IPv4 addresses (Linux specific version)</shortdesc>
@@ -254,6 +255,7 @@
 <actions>
 <action name="start"   timeout="20s" />
 <action name="stop"    timeout="20s" />
+<action name="notify"  timeout="20s" />
 <action name="status" depth="0"  timeout="20s" interval="10s" />
 <action name="monitor" depth="0"  timeout="20s" interval="10s" />
 <action name="meta-data"  timeout="5s" />
@@ -849,6 +851,101 @@
 fi
 }
 
+# Make the IPaddr2 resource highly-available by adjusting the iptables
+# information if nodes drop out of the cluster.
+handle_notify() {
+	# If this is not a cloned IPaddr2 resource, do nothing.
+	# (But if it's not cloned, how did the user set 'meta notify=true'?)
+	if [ $IP_INC_GLOBAL -eq 0 ]; then
+		ocf_log info "notify action on non-cloned resource; remove meta notify='true'"
+		exit $OCF_SUCCESS
+	fi
+
+	# To test if nodes are dropped, the best flags are when notify_type=pre and
+	# notify_operation=stop. You might not get post/stop if a node is fenced.
+	if [ x$OCF_RESKEY_CRM_meta_notify_type = xpre ] && [ x$OCF_RESKEY_CRM_meta_notify_operation = xstop ]; then
+
+		# The stopping nodes will still be included in the
+		# active_resource list, so we have to remove them.
+		local active="$OCF_RESKEY_CRM_meta_notify_active_resource"
+		for stopping in $OCF_RESKEY_CRM_meta_notify_stop_resource
+		do
+			# Sanity check: If the user has done a "crm node standby", then
+			# this method can be called by the node that's stopping.
+			local stopping_clone=`echo ${stopping} | sed "s/[^[:space:]]\+://"`
+			if [ ${stopping_clone} -eq $OCF_RESKEY_CRM_meta_clone ]; then
+				exit $OCF_SUCCESS
+			fi
+
+			# We're sane, so remove the stopping node from the active list.
+			active=`echo ${active} | sed "s/${stopping}//"`
+		done
+
+		# One of the remaining nodes has to take over the job of the dropped
+		# node(s). I'm doing the simplest thing, and choose the last
+		# node in the list of active resources. active_resource is a list like
+		# "name:0 name:1 name:2".
+		local selected_node=`echo ${active} | sed
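For context, the kernel interface a handler like this ultimately drives is the CLUSTERIP /proc file that Dejan points to further down the thread: each CLUSTERIP address gets an entry under /proc/net/ipt_CLUSTERIP/, and a node claims or releases a hash bucket by writing "+N" or "-N" into it. A rough, untested sketch, using the cluster IP from the original post and assuming node 2 is the one that dropped out:

# Sketch only: on a surviving node, take over the share of failed node 2.
# The address and node number are placeholders from this thread's examples.
ip=129.236.252.13
cat /proc/net/ipt_CLUSTERIP/$ip          # bucket(s) this host currently answers for
echo "+2" >/proc/net/ipt_CLUSTERIP/$ip   # claim node 2's bucket
echo "-2" >/proc/net/ipt_CLUSTERIP/$ip   # release it again when node 2 comes back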
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On Mon, Feb 27, 2012 at 03:39:04PM -0500, William Seligman wrote:
On 2/24/12 3:36 PM, William Seligman wrote:
On 2/17/12 7:30 AM, Dejan Muhamedagic wrote:
OK, I guess that'd also be doable by checking the following variables:
OCF_RESKEY_CRM_meta_notify_inactive_resource (set of currently inactive instances)
OCF_RESKEY_CRM_meta_notify_stop_resource (set of instances which were just stopped)
Any volunteers for a patch? :)

a) I have a test cluster that I can bring up and down at will;
b) I'm a glutton for punishment.

So I'll volunteer, since I offered to try to do something in the first place. I think I've got a handle on what to look for; e.g., one has to look for notify_type=pre and notify_operation=stop in the 'node_down' test.

Here's my patch, in my usual overly-commented style.

Sorry, I may be missing something obvious, but... is this not *the* use case of globally-unique=true? Which makes it possible to set clone-node-max = clone-max = number of nodes? Or even 7 times (or whatever) number of nodes.

And all the iptables magic is in the start operation. If one of the nodes fails, its bucket(s) will be re-allocated to the surviving nodes. And that is all fully implemented already (at least that's how I read the script). What is not implemented is changing the number of buckets aka clone-max, without restarting clones.

No need for fancy stuff in *pre* notifications, which are only statements of intent; the actual action may still fail, and all will be different than you anticipated.

Notes:
- To make this work, you need to turn on notify in the clone resources; e.g., clone ipaddr2_clone ipaddr2_resource meta notify=true
  None of the clone examples I saw in the documentation (Clusters From Scratch, Pacemaker Explained) show the notify option; only the ms examples do. You may want to revise the documentation with an IPaddr2 example.
- I tested this with my two-node cluster, and it works. I wrote it for a multi-node cluster, but I can't be sure it will work for more than two nodes. Would some nice person test this?
- I wrote my code assuming that the clone number assigned to a node would remain constant. If the clone numbers were to change by deleting/adding a node to the cluster, I don't know what would happen.

For anonymous clones, it can be relabeled. In fact, there are plans to remove the clone number from anonymous clones completely. However, for globally unique clones, the clone number is part of its identifier.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
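For anyone skimming the archive, what Lars describes would look roughly like this in crm shell syntax (a sketch only, reusing the ipaddr2_clone/ipaddr2_resource names from the quoted example and assuming a two-node cluster):

clone ipaddr2_clone ipaddr2_resource \
    meta globally-unique="true" clone-max="2" clone-node-max="2"

With clone-node-max equal to clone-max, a surviving node is allowed to start the instances, and hence take over the CLUSTERIP buckets, that were running on the failed node.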
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On 2/27/12 4:10 PM, Lars Ellenberg wrote: On Mon, Feb 27, 2012 at 03:39:04PM -0500, William Seligman wrote: On 2/24/12 3:36 PM, William Seligman wrote: On 2/17/12 7:30 AM, Dejan Muhamedagic wrote: OK, I guess that'd also be doable by checking the following variables: OCF_RESKEY_CRM_meta_notify_inactive_resource (set of currently inactive instances) OCF_RESKEY_CRM_meta_notify_stop_resource (set of instances which were just stopped) Any volunteers for a patch? :) a) I have a test cluster that I can bring up and down at will; b) I'm a glutton for punishment. So I'll volunteer, since I offered to try to do something in the first place. I think I've got a handle on what to look for; e.g., one has to look for notify_type=pre and notify_operation=stop in the 'node_down' test. Here's my patch, in my usual overly-commented style. Sorry, I may be missing something obvious, but... is this not *the* use case of globally-unique=true? I did not know about globally-unique. I just tested it, replacing (with name substitutions): clone ipaddr2_clone ipaddr2_resource meta notify=true with clone ipaddr2_clone ipaddr2_resource meta globally-unique=true This fell back to the old behavior I described in the first message in this thread: iptables did not update when I took down one of my nodes. I expected this, since according to Pacemaker Explained, globally-unique=true is the default. If this had worked, I never would have reported the problem in the first place. Is there something else that could suppress the behavior you described for globally-unique=true? Which makes it possible to set clone-node-max = clone-max = number of nodes? Or even 7 times (or whatever) number of nodes. And all the iptables magic is in the start operation. If one of the nodes fails, it's bucket(s) will be re-allocated to the surviving nodes. And that is all fully implemented already (at least that's how I read the script). What is not implemented is chaning the number of buckets aka clone-max, without restarting clones. No need for fancy stuff in *pre* notifications, which are only statements of intent; the actual action may still fail, and all will be different than you anticipated. Notes: - To make this work, you need to turn on notify in the clone resources; e.g., clone ipaddr2_clone ipaddr2_resource meta notify=true None of the clone examples I saw in the documentation (Clusters From Scratch, Pacemaker Explained) show the notify option; only the ms examples do. You may want to revise the documentation with an IPaddr2 example. - I tested this with my two-node cluster, and it works. I wrote it for a multi-node cluster, but I can't be sure it will work for more than two nodes. Would some nice person test this? - I wrote my code assuming that the clone number assigned to a node would remain constant. If the clone numbers were to change by deleting/adding a node to the cluster, I don't know what would happen. For anonymous clones, it can be relabeled. In fact, there are plans to remove the clone number from anonymous clones completely. However, for globally unique clones, the clone number is part of its identifier. -- Bill Seligman | Phone: (914) 591-2823 Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu PO Box 137| Irvington NY 10533 USA| http://www.nevis.columbia.edu/~seligman/ smime.p7s Description: S/MIME Cryptographic Signature ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On Mon, Feb 27, 2012 at 05:23:36PM -0500, William Seligman wrote: On 2/27/12 4:10 PM, Lars Ellenberg wrote: On Mon, Feb 27, 2012 at 03:39:04PM -0500, William Seligman wrote: On 2/24/12 3:36 PM, William Seligman wrote: On 2/17/12 7:30 AM, Dejan Muhamedagic wrote: OK, I guess that'd also be doable by checking the following variables: OCF_RESKEY_CRM_meta_notify_inactive_resource (set of currently inactive instances) OCF_RESKEY_CRM_meta_notify_stop_resource (set of instances which were just stopped) Any volunteers for a patch? :) a) I have a test cluster that I can bring up and down at will; b) I'm a glutton for punishment. So I'll volunteer, since I offered to try to do something in the first place. I think I've got a handle on what to look for; e.g., one has to look for notify_type=pre and notify_operation=stop in the 'node_down' test. Here's my patch, in my usual overly-commented style. Sorry, I may be missing something obvious, but... is this not *the* use case of globally-unique=true? I did not know about globally-unique. I just tested it, replacing (with name substitutions): clone ipaddr2_clone ipaddr2_resource meta notify=true with clone ipaddr2_clone ipaddr2_resource meta globally-unique=true This fell back to the old behavior I described in the first message in this thread: iptables did not update when I took down one of my nodes. I expected this, since according to Pacemaker Explained, globally-unique=true is the default. If this had worked, I never would have reported the problem in the first place. Is there something else that could suppress the behavior you described for globally-unique=true? You need clone-node-max == clone-max. It defaults to 1. Which obviously prevents nodes already running one instance from taking over an other... -- : Lars Ellenberg : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
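To see what a clone is actually running with before changing anything, the configuration can be dumped for just that resource; any meta attribute that is not listed falls back to its default, which for clone-node-max is 1. A sketch, assuming the ip_clone name used in this exchange (the crm_resource spelling may vary by pacemaker version):

crm configure show ip_clone
crm_resource --resource ip_clone --meta --get-parameter clone-node-max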
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On 2/27/12 5:33 PM, Lars Ellenberg wrote: On Mon, Feb 27, 2012 at 05:23:36PM -0500, William Seligman wrote: On 2/27/12 4:10 PM, Lars Ellenberg wrote: On Mon, Feb 27, 2012 at 03:39:04PM -0500, William Seligman wrote: On 2/24/12 3:36 PM, William Seligman wrote: On 2/17/12 7:30 AM, Dejan Muhamedagic wrote: OK, I guess that'd also be doable by checking the following variables: OCF_RESKEY_CRM_meta_notify_inactive_resource (set of currently inactive instances) OCF_RESKEY_CRM_meta_notify_stop_resource (set of instances which were just stopped) Any volunteers for a patch? :) a) I have a test cluster that I can bring up and down at will; b) I'm a glutton for punishment. So I'll volunteer, since I offered to try to do something in the first place. I think I've got a handle on what to look for; e.g., one has to look for notify_type=pre and notify_operation=stop in the 'node_down' test. Here's my patch, in my usual overly-commented style. Sorry, I may be missing something obvious, but... is this not *the* use case of globally-unique=true? I did not know about globally-unique. I just tested it, replacing (with name substitutions): clone ipaddr2_clone ipaddr2_resource meta notify=true with clone ipaddr2_clone ipaddr2_resource meta globally-unique=true This fell back to the old behavior I described in the first message in this thread: iptables did not update when I took down one of my nodes. I expected this, since according to Pacemaker Explained, globally-unique=true is the default. If this had worked, I never would have reported the problem in the first place. Is there something else that could suppress the behavior you described for globally-unique=true? You need clone-node-max == clone-max. It defaults to 1. Which obviously prevents nodes already running one instance from taking over an other... I tried it, and it works. So there's no need for my patch. The magic invocation for a highly-available IPaddr2 resource is: ip_clone ip_resource meta clone-max=2 clone-node-max=2 Could this please be documented more clearly somewhere? -- Bill Seligman | Phone: (914) 591-2823 Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu PO Box 137| Irvington NY 10533 USA| http://www.nevis.columbia.edu/~seligman/ smime.p7s Description: S/MIME Cryptographic Signature ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On 2/27/12 5:41 PM, William Seligman wrote: On 2/27/12 5:33 PM, Lars Ellenberg wrote: On Mon, Feb 27, 2012 at 05:23:36PM -0500, William Seligman wrote: On 2/27/12 4:10 PM, Lars Ellenberg wrote: On Mon, Feb 27, 2012 at 03:39:04PM -0500, William Seligman wrote: On 2/24/12 3:36 PM, William Seligman wrote: On 2/17/12 7:30 AM, Dejan Muhamedagic wrote: OK, I guess that'd also be doable by checking the following variables: OCF_RESKEY_CRM_meta_notify_inactive_resource (set of currently inactive instances) OCF_RESKEY_CRM_meta_notify_stop_resource (set of instances which were just stopped) Any volunteers for a patch? :) a) I have a test cluster that I can bring up and down at will; b) I'm a glutton for punishment. So I'll volunteer, since I offered to try to do something in the first place. I think I've got a handle on what to look for; e.g., one has to look for notify_type=pre and notify_operation=stop in the 'node_down' test. Here's my patch, in my usual overly-commented style. Sorry, I may be missing something obvious, but... is this not *the* use case of globally-unique=true? I did not know about globally-unique. I just tested it, replacing (with name substitutions): clone ipaddr2_clone ipaddr2_resource meta notify=true with clone ipaddr2_clone ipaddr2_resource meta globally-unique=true This fell back to the old behavior I described in the first message in this thread: iptables did not update when I took down one of my nodes. I expected this, since according to Pacemaker Explained, globally-unique=true is the default. If this had worked, I never would have reported the problem in the first place. Is there something else that could suppress the behavior you described for globally-unique=true? You need clone-node-max == clone-max. It defaults to 1. Which obviously prevents nodes already running one instance from taking over an other... I tried it, and it works. So there's no need for my patch. The magic invocation for a highly-available IPaddr2 resource is: ip_clone ip_resource meta clone-max=2 clone-node-max=2 Could this please be documented more clearly somewhere? Umm... it turns out to be: ip_clone ip_resource meta globally-unique=true clone-max=2 clone-node-max=2 and for a two-node cluster, of course. So I guess globally-unique=true is not the default after all. -- Bill Seligman | Phone: (914) 591-2823 Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu PO Box 137| Irvington NY 10533 USA| http://www.nevis.columbia.edu/~seligman/ smime.p7s Description: S/MIME Cryptographic Signature ___ Linux-HA mailing list Linux-HA@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha See also: http://linux-ha.org/ReportingProblems
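For an existing clone, the missing meta attributes can also be added without rewriting the whole definition. A sketch using crm_resource, with the ip_clone name from the example above; treat the exact option spelling as an assumption to check against your pacemaker version:

crm_resource --resource ip_clone --meta \
    --set-parameter globally-unique --parameter-value true
crm_resource --resource ip_clone --meta \
    --set-parameter clone-max --parameter-value 2
crm_resource --resource ip_clone --meta \
    --set-parameter clone-node-max --parameter-value 2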
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On Mon, Feb 27, 2012 at 05:41:09PM -0500, William Seligman wrote:
is this not *the* use case of globally-unique=true?

I did not know about globally-unique. I just tested it, replacing (with name substitutions):
clone ipaddr2_clone ipaddr2_resource meta notify=true
with
clone ipaddr2_clone ipaddr2_resource meta globally-unique=true
This fell back to the old behavior I described in the first message in this thread: iptables did not update when I took down one of my nodes. I expected this, since according to Pacemaker Explained, globally-unique=true is the default. If this had worked, I never would have reported the problem in the first place. Is there something else that could suppress the behavior you described for globally-unique=true?

You need clone-node-max == clone-max. It defaults to 1. Which obviously prevents nodes already running one instance from taking over another...

I tried it, and it works. So there's no need for my patch. The magic invocation for a highly-available IPaddr2 resource is:
ip_clone ip_resource meta clone-max=2 clone-node-max=2

Note that, if you have more than two nodes, to get more evenly distributed buckets in the case of failover, you can also specify larger numbers than you have nodes. In which case by default, all nodes would run several. And in case of failover, each remaining node should take over its share.

Could this please be documented more clearly somewhere?

Clusters from Scratch not good enough?
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08s06.html

But yes, I'll add a note to the IPaddr2 meta data where the long desc talks about cluster IP usage...

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
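Putting Lars's "larger numbers than you have nodes" suggestion into the same syntax: on a three-node cluster, for example, one could run two buckets per node, so that after a failure the two survivors split the dead node's buckets between them. Again just a sketch with made-up resource names:

clone ip_clone ip_resource \
    meta globally-unique="true" clone-max="6" clone-node-max="6"

clone-node-max is kept equal to clone-max so that, in the worst case, a single surviving node is allowed to run every instance.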
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On 2/16/12 11:14 PM, William Seligman wrote: On 2/16/12 8:13 PM, Andrew Beekhof wrote: On Fri, Feb 17, 2012 at 5:05 AM, Dejan Muhamedagicdeja...@fastmail.fm wrote: On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote: On 2/10/12 4:53 PM, William Seligman wrote: I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying). Versions: Scientific Linux 6.2 pacemaker-1.1.6 resource-agents-3.9.2 I'm using cloned IPaddr2 resources: primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip=129.236.252.13 cidr_netmask=32 \ op monitor interval=30s primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \ params ip=10.44.7.13 cidr_netmask=32 \ op monitor interval=31s primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \ params ip=10.43.7.13 cidr_netmask=32 \ op monitor interval=32s group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox clone ClusterIPClone ClusterIPGroup When both nodes of my two-node cluster are running, everything looks and functions OK. From service iptables status on node 1 (hypatia-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 On node 2 (orestes-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0 If I do a simple test of ssh'ing into 129.236.252.13, I see that I alternately login into hypatia-tb and orestes-tb. All is good. Now take orestes-tb offline. The iptables rules on hypatia-tb are unchanged: 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be machine-dependent. On one machine I get in, from another I get a time-out. Both machines show the same MAC address for 129.236.252.13: arp 129.236.252.13 Address HWtype HWaddress Flags Mask Iface hamilton-tb.nevis.colum ether B1:95:5A:B5:16:79 C eth0 Is this the way the cloned IPaddr2 resource is supposed to behave in the event of a node failure, or have I set things up incorrectly? I spent some time looking over the IPaddr2 script. As far as I can tell, the script has no mechanism for reconfiguring iptables in the event of a change of state in the number of clones. I might be stupid -- er -- dedicated enough to make this change on my own, then share the code with the appropriate group. The change seems to be relatively simple. It would be in the monitor operation. 
In pseudo-code: if (IPaddr2 resource is already started ) then if ( OCF_RESKEY_CRM_meta_clone_max != OCF_RESKEY_CRM_meta_clone_max last time || OCF_RESKEY_CRM_meta_clone != OCF_RESKEY_CRM_meta_clone last time ) ip_stop ip_start Just changing the iptables entries should suffice, right? Besides, doing stop/start in the monitor is sort of unexpected. Another option is to add the missing node to one of the nodes which are still running (echo +n /proc/net/ipt_CLUSTERIP/ip). But any of that would be extremely tricky to implement properly (if not impossible). fi fi If this would work, then I'd have two questions for the experts: - Would the values of OCF_RESKEY_CRM_meta_clone_max and/or OCF_RESKEY_CRM_meta_clone change if the number of cloned copies of a resource changed? OCF_RESKEY_CRM_meta_clone_max definitely not. OCF_RESKEY_CRM_meta_clone may change but also probably not; it's just a clone sequence number. In short, there's no way to figure out the total number of clones by examining the environment.
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On Sat, Feb 25, 2012 at 6:39 AM, William Seligman selig...@nevis.columbia.edu wrote:

At this point, it looks like my notion of re-writing IPaddr2 won't work. I'm redesigning my cluster configuration so I don't require cloned/highly-available IP addresses.

Is this a bug? Is there a bugzilla or similar resource for resource agents? I did file a bug report, though for some reason my searches turn up nothing. Would whoever manages such things respond to the "notify doesn't work" part of the post with "user doesn't know what he is doing" or whatever is relevant.

chuckle. will do :)
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On 2/17/12 7:30 AM, Dejan Muhamedagic wrote:
On Fri, Feb 17, 2012 at 01:15:04PM +0100, Dejan Muhamedagic wrote:
On Fri, Feb 17, 2012 at 12:13:49PM +1100, Andrew Beekhof wrote:
[...]
What about notifications? They would be the right point to re-configure things I'd have thought.

Sounds like the right way. Still, it may be hard to coordinate between different instances. Unless we figure out how to map nodes to numbers used by the CLUSTERIP. For instance, the notify operation gets:

OCF_RESKEY_CRM_meta_notify_stop_resource=ip_lb:2
OCF_RESKEY_CRM_meta_notify_stop_uname=xen-f

But the instance number may not match the node number from

Scratch that.

IP_CIP_FILE="/proc/net/ipt_CLUSTERIP/$OCF_RESKEY_ip"
IP_INC_NO=`expr ${OCF_RESKEY_CRM_meta_clone:-0} + 1`
...
echo "+$IP_INC_NO" >$IP_CIP_FILE

/proc/net/ipt_CLUSTERIP/<ip> and that's where we should add the node. It should be something like:

notify() {
    if node_down; then
        echo +<node_num> >/proc/net/ipt_CLUSTERIP/<ip>
    elif node_up; then
        echo -<node_num> >/proc/net/ipt_CLUSTERIP/<ip>
    fi
}

Another issue is that the above code should be executed on _exactly_ one node.

OK, I guess that'd also be doable by checking the following variables:
OCF_RESKEY_CRM_meta_notify_inactive_resource (set of currently inactive instances)
OCF_RESKEY_CRM_meta_notify_stop_resource (set of instances which were just stopped)
Any volunteers for a patch? :)

a) I have a test cluster that I can bring up and down at will;
b) I'm a glutton for punishment.

So I'll volunteer, since I offered to try to do something in the first place. I think I've got a handle on what to look for; e.g., one has to look for notify_type=pre and notify_operation=stop in the 'node_down' test.

--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu
PO Box 137 | Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
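To make the "executed on exactly one node" point concrete, here is a rough, untested sketch of how a notify handler could agree on a single surviving instance without any extra communication. The OCF_RESKEY_CRM_meta_notify_* variables are the ones quoted above; the "highest surviving clone number" rule is just one arbitrary deterministic choice:

# Sketch only: run the CLUSTERIP takeover from exactly one surviving instance.
survivors=""
for inst in $OCF_RESKEY_CRM_meta_notify_active_resource; do
    case " $OCF_RESKEY_CRM_meta_notify_stop_resource " in
        *" $inst "*) ;;                          # this instance is stopping: skip it
        *) survivors="$survivors ${inst##*:}" ;; # keep its clone number
    esac
done
highest=`echo $survivors | tr ' ' '\n' | sort -n | tail -1`
if [ "$OCF_RESKEY_CRM_meta_clone" = "$highest" ]; then
    : # this instance is the designated one; do the echo +<node_num> here
fi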
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On Thu, Feb 16, 2012 at 11:14:37PM -0500, William Seligman wrote: On 2/16/12 8:13 PM, Andrew Beekhof wrote: On Fri, Feb 17, 2012 at 5:05 AM, Dejan Muhamedagicdeja...@fastmail.fm wrote: Hi, On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote: On 2/10/12 4:53 PM, William Seligman wrote: I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying). Versions: Scientific Linux 6.2 pacemaker-1.1.6 resource-agents-3.9.2 I'm using cloned IPaddr2 resources: primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip=129.236.252.13 cidr_netmask=32 \ op monitor interval=30s primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \ params ip=10.44.7.13 cidr_netmask=32 \ op monitor interval=31s primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \ params ip=10.43.7.13 cidr_netmask=32 \ op monitor interval=32s group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox clone ClusterIPClone ClusterIPGroup When both nodes of my two-node cluster are running, everything looks and functions OK. From service iptables status on node 1 (hypatia-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 On node 2 (orestes-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0 If I do a simple test of ssh'ing into 129.236.252.13, I see that I alternately login into hypatia-tb and orestes-tb. All is good. Now take orestes-tb offline. The iptables rules on hypatia-tb are unchanged: 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be machine-dependent. On one machine I get in, from another I get a time-out. Both machines show the same MAC address for 129.236.252.13: arp 129.236.252.13 Address HWtype HWaddress Flags Mask Iface hamilton-tb.nevis.colum ether B1:95:5A:B5:16:79 C eth0 Is this the way the cloned IPaddr2 resource is supposed to behave in the event of a node failure, or have I set things up incorrectly? I spent some time looking over the IPaddr2 script. As far as I can tell, the script has no mechanism for reconfiguring iptables in the event of a change of state in the number of clones. I might be stupid -- er -- dedicated enough to make this change on my own, then share the code with the appropriate group. The change seems to be relatively simple. It would be in the monitor operation. 
In pseudo-code: if (IPaddr2 resource is already started ) then if ( OCF_RESKEY_CRM_meta_clone_max != OCF_RESKEY_CRM_meta_clone_max last time || OCF_RESKEY_CRM_meta_clone != OCF_RESKEY_CRM_meta_clone last time ) ip_stop ip_start Just changing the iptables entries should suffice, right? Besides, doing stop/start in the monitor is sort of unexpected. Another option is to add the missing node to one of the nodes which are still running (echo +n /proc/net/ipt_CLUSTERIP/ip). But any of that would be extremely tricky to implement properly (if not impossible). fi fi If this would work, then I'd have two questions for the experts: - Would the values of OCF_RESKEY_CRM_meta_clone_max and/or OCF_RESKEY_CRM_meta_clone change if the number of cloned copies of a resource changed? OCF_RESKEY_CRM_meta_clone_max definitely not. OCF_RESKEY_CRM_meta_clone may change but also probably not; it's just a clone sequence number. In short, there's no way to figure out the
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On Fri, Feb 17, 2012 at 12:13:49PM +1100, Andrew Beekhof wrote: On Fri, Feb 17, 2012 at 5:05 AM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote: On 2/10/12 4:53 PM, William Seligman wrote: I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying). Versions: Scientific Linux 6.2 pacemaker-1.1.6 resource-agents-3.9.2 I'm using cloned IPaddr2 resources: primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip=129.236.252.13 cidr_netmask=32 \ op monitor interval=30s primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \ params ip=10.44.7.13 cidr_netmask=32 \ op monitor interval=31s primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \ params ip=10.43.7.13 cidr_netmask=32 \ op monitor interval=32s group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox clone ClusterIPClone ClusterIPGroup When both nodes of my two-node cluster are running, everything looks and functions OK. From service iptables status on node 1 (hypatia-tb): 5 CLUSTERIP all -- 0.0.0.0/0 10.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6 CLUSTERIP all -- 0.0.0.0/0 10.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7 CLUSTERIP all -- 0.0.0.0/0 129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 On node 2 (orestes-tb): 5 CLUSTERIP all -- 0.0.0.0/0 10.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0 6 CLUSTERIP all -- 0.0.0.0/0 10.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0 7 CLUSTERIP all -- 0.0.0.0/0 129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0 If I do a simple test of ssh'ing into 129.236.252.13, I see that I alternately login into hypatia-tb and orestes-tb. All is good. Now take orestes-tb offline. The iptables rules on hypatia-tb are unchanged: 5 CLUSTERIP all -- 0.0.0.0/0 10.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6 CLUSTERIP all -- 0.0.0.0/0 10.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7 CLUSTERIP all -- 0.0.0.0/0 129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be machine-dependent. On one machine I get in, from another I get a time-out. Both machines show the same MAC address for 129.236.252.13: arp 129.236.252.13 Address HWtype HWaddress Flags Mask Iface hamilton-tb.nevis.colum ether B1:95:5A:B5:16:79 C eth0 Is this the way the cloned IPaddr2 resource is supposed to behave in the event of a node failure, or have I set things up incorrectly? I spent some time looking over the IPaddr2 script. As far as I can tell, the script has no mechanism for reconfiguring iptables in the event of a change of state in the number of clones. I might be stupid -- er -- dedicated enough to make this change on my own, then share the code with the appropriate group. The change seems to be relatively simple. It would be in the monitor operation. 
In pseudo-code: if ( IPaddr2 resource is already started ) then if ( OCF_RESKEY_CRM_meta_clone_max != OCF_RESKEY_CRM_meta_clone_max last time || OCF_RESKEY_CRM_meta_clone != OCF_RESKEY_CRM_meta_clone last time ) ip_stop ip_start Just changing the iptables entries should suffice, right? Besides, doing stop/start in the monitor is sort of unexpected. Another option is to add the missing node to one of the nodes which are still running (echo +n /proc/net/ipt_CLUSTERIP/ip). But any of that would be extremely tricky to implement properly (if not impossible). fi fi If this would work, then I'd have two questions for the experts: - Would the values of OCF_RESKEY_CRM_meta_clone_max and/or OCF_RESKEY_CRM_meta_clone change if the number of cloned copies of a resource changed? OCF_RESKEY_CRM_meta_clone_max definitely not.
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On Fri, Feb 17, 2012 at 01:15:04PM +0100, Dejan Muhamedagic wrote:
On Fri, Feb 17, 2012 at 12:13:49PM +1100, Andrew Beekhof wrote:
[...]
What about notifications? They would be the right point to re-configure things I'd have thought.

Sounds like the right way. Still, it may be hard to coordinate between different instances. Unless we figure out how to map nodes to numbers used by the CLUSTERIP. For instance, the notify operation gets:

OCF_RESKEY_CRM_meta_notify_stop_resource=ip_lb:2
OCF_RESKEY_CRM_meta_notify_stop_uname=xen-f

But the instance number may not match the node number from

Scratch that.

IP_CIP_FILE="/proc/net/ipt_CLUSTERIP/$OCF_RESKEY_ip"
IP_INC_NO=`expr ${OCF_RESKEY_CRM_meta_clone:-0} + 1`
...
echo "+$IP_INC_NO" >$IP_CIP_FILE

/proc/net/ipt_CLUSTERIP/<ip> and that's where we should add the node. It should be something like:

notify() {
    if node_down; then
        echo +<node_num> >/proc/net/ipt_CLUSTERIP/<ip>
    elif node_up; then
        echo -<node_num> >/proc/net/ipt_CLUSTERIP/<ip>
    fi
}

Another issue is that the above code should be executed on _exactly_ one node.

OK, I guess that'd also be doable by checking the following variables:
OCF_RESKEY_CRM_meta_notify_inactive_resource (set of currently inactive instances)
OCF_RESKEY_CRM_meta_notify_stop_resource (set of instances which were just stopped)
Any volunteers for a patch? :)

Thanks,
Dejan

Cheers,
Dejan
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
Hi, On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote: On 2/10/12 4:53 PM, William Seligman wrote: I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying). Versions: Scientific Linux 6.2 pacemaker-1.1.6 resource-agents-3.9.2 I'm using cloned IPaddr2 resources: primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip=129.236.252.13 cidr_netmask=32 \ op monitor interval=30s primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \ params ip=10.44.7.13 cidr_netmask=32 \ op monitor interval=31s primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \ params ip=10.43.7.13 cidr_netmask=32 \ op monitor interval=32s group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox clone ClusterIPClone ClusterIPGroup When both nodes of my two-node cluster are running, everything looks and functions OK. From service iptables status on node 1 (hypatia-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 On node 2 (orestes-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0 If I do a simple test of ssh'ing into 129.236.252.13, I see that I alternately login into hypatia-tb and orestes-tb. All is good. Now take orestes-tb offline. The iptables rules on hypatia-tb are unchanged: 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be machine-dependent. On one machine I get in, from another I get a time-out. Both machines show the same MAC address for 129.236.252.13: arp 129.236.252.13 Address HWtype HWaddress Flags Mask Iface hamilton-tb.nevis.colum ether B1:95:5A:B5:16:79 C eth0 Is this the way the cloned IPaddr2 resource is supposed to behave in the event of a node failure, or have I set things up incorrectly? I spent some time looking over the IPaddr2 script. As far as I can tell, the script has no mechanism for reconfiguring iptables in the event of a change of state in the number of clones. I might be stupid -- er -- dedicated enough to make this change on my own, then share the code with the appropriate group. The change seems to be relatively simple. It would be in the monitor operation. 
In pseudo-code: if ( IPaddr2 resource is already started ) then if ( OCF_RESKEY_CRM_meta_clone_max != OCF_RESKEY_CRM_meta_clone_max last time || OCF_RESKEY_CRM_meta_clone != OCF_RESKEY_CRM_meta_clone last time ) ip_stop ip_start Just changing the iptables entries should suffice, right? Besides, doing stop/start in the monitor is sort of unexpected. Another option is to add the missing node to one of the nodes which are still running (echo +n /proc/net/ipt_CLUSTERIP/ip). But any of that would be extremely tricky to implement properly (if not impossible). fi fi If this would work, then I'd have two questions for the experts: - Would the values of OCF_RESKEY_CRM_meta_clone_max and/or OCF_RESKEY_CRM_meta_clone change if the number of cloned copies of a resource changed? OCF_RESKEY_CRM_meta_clone_max definitely not. OCF_RESKEY_CRM_meta_clone may change but also probably not; it's just a clone sequence number. In short, there's no way to figure out the total number of clones by examining the environment. Information such as membership changes doesn't trickle down to the resource instances. Of course, it's possible to find
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On Fri, Feb 17, 2012 at 5:05 AM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote: On 2/10/12 4:53 PM, William Seligman wrote: I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying). Versions: Scientific Linux 6.2 pacemaker-1.1.6 resource-agents-3.9.2 I'm using cloned IPaddr2 resources: primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip=129.236.252.13 cidr_netmask=32 \ op monitor interval=30s primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \ params ip=10.44.7.13 cidr_netmask=32 \ op monitor interval=31s primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \ params ip=10.43.7.13 cidr_netmask=32 \ op monitor interval=32s group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox clone ClusterIPClone ClusterIPGroup When both nodes of my two-node cluster are running, everything looks and functions OK. From service iptables status on node 1 (hypatia-tb): 5 CLUSTERIP all -- 0.0.0.0/0 10.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6 CLUSTERIP all -- 0.0.0.0/0 10.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7 CLUSTERIP all -- 0.0.0.0/0 129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 On node 2 (orestes-tb): 5 CLUSTERIP all -- 0.0.0.0/0 10.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0 6 CLUSTERIP all -- 0.0.0.0/0 10.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0 7 CLUSTERIP all -- 0.0.0.0/0 129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0 If I do a simple test of ssh'ing into 129.236.252.13, I see that I alternately login into hypatia-tb and orestes-tb. All is good. Now take orestes-tb offline. The iptables rules on hypatia-tb are unchanged: 5 CLUSTERIP all -- 0.0.0.0/0 10.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6 CLUSTERIP all -- 0.0.0.0/0 10.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7 CLUSTERIP all -- 0.0.0.0/0 129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be machine-dependent. On one machine I get in, from another I get a time-out. Both machines show the same MAC address for 129.236.252.13: arp 129.236.252.13 Address HWtype HWaddress Flags Mask Iface hamilton-tb.nevis.colum ether B1:95:5A:B5:16:79 C eth0 Is this the way the cloned IPaddr2 resource is supposed to behave in the event of a node failure, or have I set things up incorrectly? I spent some time looking over the IPaddr2 script. As far as I can tell, the script has no mechanism for reconfiguring iptables in the event of a change of state in the number of clones. I might be stupid -- er -- dedicated enough to make this change on my own, then share the code with the appropriate group. The change seems to be relatively simple. It would be in the monitor operation. 
In pseudo-code: if ( IPaddr2 resource is already started ) then if ( OCF_RESKEY_CRM_meta_clone_max != OCF_RESKEY_CRM_meta_clone_max last time || OCF_RESKEY_CRM_meta_clone != OCF_RESKEY_CRM_meta_clone last time ) ip_stop ip_start Just changing the iptables entries should suffice, right? Besides, doing stop/start in the monitor is sort of unexpected. Another option is to add the missing node to one of the nodes which are still running (echo +n /proc/net/ipt_CLUSTERIP/ip). But any of that would be extremely tricky to implement properly (if not impossible). fi fi If this would work, then I'd have two questions for the experts: - Would the values of OCF_RESKEY_CRM_meta_clone_max and/or OCF_RESKEY_CRM_meta_clone change if the number of cloned copies of a resource changed? OCF_RESKEY_CRM_meta_clone_max definitely not. OCF_RESKEY_CRM_meta_clone may change but also probably not; it's just a clone sequence number. In short, there's no way to figure out the total number of clones by examining the environment. Information such as membership changes
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On 2/16/12 8:13 PM, Andrew Beekhof wrote: On Fri, Feb 17, 2012 at 5:05 AM, Dejan Muhamedagicdeja...@fastmail.fm wrote: Hi, On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote: On 2/10/12 4:53 PM, William Seligman wrote: I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying). Versions: Scientific Linux 6.2 pacemaker-1.1.6 resource-agents-3.9.2 I'm using cloned IPaddr2 resources: primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip=129.236.252.13 cidr_netmask=32 \ op monitor interval=30s primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \ params ip=10.44.7.13 cidr_netmask=32 \ op monitor interval=31s primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \ params ip=10.43.7.13 cidr_netmask=32 \ op monitor interval=32s group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox clone ClusterIPClone ClusterIPGroup When both nodes of my two-node cluster are running, everything looks and functions OK. From service iptables status on node 1 (hypatia-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 On node 2 (orestes-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0 If I do a simple test of ssh'ing into 129.236.252.13, I see that I alternately login into hypatia-tb and orestes-tb. All is good. Now take orestes-tb offline. The iptables rules on hypatia-tb are unchanged: 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be machine-dependent. On one machine I get in, from another I get a time-out. Both machines show the same MAC address for 129.236.252.13: arp 129.236.252.13 Address HWtype HWaddress Flags MaskIface hamilton-tb.nevis.colum ether B1:95:5A:B5:16:79 C eth0 Is this the way the cloned IPaddr2 resource is supposed to behave in the event of a node failure, or have I set things up incorrectly? I spent some time looking over the IPaddr2 script. As far as I can tell, the script has no mechanism for reconfiguring iptables in the event of a change of state in the number of clones. I might be stupid -- er -- dedicated enough to make this change on my own, then share the code with the appropriate group. The change seems to be relatively simple. It would be in the monitor operation. 
In pseudo-code: if (IPaddr2 resource is already started ) then if ( OCF_RESKEY_CRM_meta_clone_max != OCF_RESKEY_CRM_meta_clone_max last time || OCF_RESKEY_CRM_meta_clone != OCF_RESKEY_CRM_meta_clone last time ) ip_stop ip_start Just changing the iptables entries should suffice, right? Besides, doing stop/start in the monitor is sort of unexpected. Another option is to add the missing node to one of the nodes which are still running (echo +n /proc/net/ipt_CLUSTERIP/ip). But any of that would be extremely tricky to implement properly (if not impossible). fi fi If this would work, then I'd have two questions for the experts: - Would the values of OCF_RESKEY_CRM_meta_clone_max and/or OCF_RESKEY_CRM_meta_clone change if the number of cloned copies of a resource changed? OCF_RESKEY_CRM_meta_clone_max definitely not. OCF_RESKEY_CRM_meta_clone may change but also probably not; it's just a clone sequence number. In short, there's no way to figure out the total number of clones by examining the environment. Information such as membership changes doesn't trickle down to the resource instances. What about notifications? The would be the right point to re-configure things
Re: [Linux-HA] Understanding the behavior of IPaddr2 clone
On 2/10/12 4:53 PM, William Seligman wrote: I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying). Versions: Scientific Linux 6.2 pacemaker-1.1.6 resource-agents-3.9.2 I'm using cloned IPaddr2 resources: primitive ClusterIP ocf:heartbeat:IPaddr2 \ params ip=129.236.252.13 cidr_netmask=32 \ op monitor interval=30s primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \ params ip=10.44.7.13 cidr_netmask=32 \ op monitor interval=31s primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \ params ip=10.43.7.13 cidr_netmask=32 \ op monitor interval=32s group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox clone ClusterIPClone ClusterIPGroup When both nodes of my two-node cluster are running, everything looks and functions OK. From service iptables status on node 1 (hypatia-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 On node 2 (orestes-tb): 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0 If I do a simple test of ssh'ing into 129.236.252.13, I see that I alternately login into hypatia-tb and orestes-tb. All is good. Now take orestes-tb offline. The iptables rules on hypatia-tb are unchanged: 5CLUSTERIP all -- 0.0.0.0/010.43.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0 6CLUSTERIP all -- 0.0.0.0/010.44.7.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0 7CLUSTERIP all -- 0.0.0.0/0129.236.252.13 CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0 If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be machine-dependent. On one machine I get in, from another I get a time-out. Both machines show the same MAC address for 129.236.252.13: arp 129.236.252.13 Address HWtype HWaddress Flags Mask Iface hamilton-tb.nevis.colum ether B1:95:5A:B5:16:79 C eth0 Is this the way the cloned IPaddr2 resource is supposed to behave in the event of a node failure, or have I set things up incorrectly? I spent some time looking over the IPaddr2 script. As far as I can tell, the script has no mechanism for reconfiguring iptables in the event of a change of state in the number of clones. I might be stupid -- er -- dedicated enough to make this change on my own, then share the code with the appropriate group. The change seems to be relatively simple. It would be in the monitor operation. 
In pseudo-code:

if ( IPaddr2 resource is already started ) then
    if ( OCF_RESKEY_CRM_meta_clone_max != OCF_RESKEY_CRM_meta_clone_max last time ||
         OCF_RESKEY_CRM_meta_clone != OCF_RESKEY_CRM_meta_clone last time )
        ip_stop
        ip_start
    fi
fi

If this would work, then I'd have two questions for the experts:

- Would the values of OCF_RESKEY_CRM_meta_clone_max and/or OCF_RESKEY_CRM_meta_clone change if the number of cloned copies of a resource changed?
- Is there some standard mechanism by which RA scripts can maintain persistent information between successive calls?

I realize there's a flaw in the logic: it risks breaking an ongoing IP connection. But as it stands, IPaddr2 is a clonable resource but not a highly-available one. If one of N cloned copies goes down, then one out of N new network connections to the IP address will fail.

--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu
PO Box 137 | Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
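On the question of persistent information between calls: resource agents typically stash per-instance runtime state under $HA_RSCTMP, so the pseudo-code above could be sketched roughly as follows. This is an illustration only; as the follow-ups in this thread note, restarting from within monitor is unexpected behaviour for an agent.

# Sketch of the pseudo-code above, inside the monitor path.
STATEFILE="${HA_RSCTMP}/${OCF_RESOURCE_INSTANCE}.clone_state"
current="${OCF_RESKEY_CRM_meta_clone_max}:${OCF_RESKEY_CRM_meta_clone}"
previous=`cat "$STATEFILE" 2>/dev/null`
if [ -n "$previous" ] && [ "$previous" != "$current" ]; then
    ip_stop
    ip_start
fi
echo "$current" >"$STATEFILE"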
[Linux-HA] Understanding the behavior of IPaddr2 clone
I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying).

Versions:
Scientific Linux 6.2
pacemaker-1.1.6
resource-agents-3.9.2

I'm using cloned IPaddr2 resources:

primitive ClusterIP ocf:heartbeat:IPaddr2 \
    params ip=129.236.252.13 cidr_netmask=32 \
    op monitor interval=30s
primitive ClusterIPLocal ocf:heartbeat:IPaddr2 \
    params ip=10.44.7.13 cidr_netmask=32 \
    op monitor interval=31s
primitive ClusterIPSandbox ocf:heartbeat:IPaddr2 \
    params ip=10.43.7.13 cidr_netmask=32 \
    op monitor interval=32s
group ClusterIPGroup ClusterIP ClusterIPLocal ClusterIPSandbox
clone ClusterIPClone ClusterIPGroup

When both nodes of my two-node cluster are running, everything looks and functions OK. From "service iptables status" on node 1 (hypatia-tb):

5    CLUSTERIP  all  --  0.0.0.0/0  10.43.7.13      CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0
6    CLUSTERIP  all  --  0.0.0.0/0  10.44.7.13      CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0
7    CLUSTERIP  all  --  0.0.0.0/0  129.236.252.13  CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0

On node 2 (orestes-tb):

5    CLUSTERIP  all  --  0.0.0.0/0  10.43.7.13      CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=2 hash_init=0
6    CLUSTERIP  all  --  0.0.0.0/0  10.44.7.13      CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=2 hash_init=0
7    CLUSTERIP  all  --  0.0.0.0/0  129.236.252.13  CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=2 hash_init=0

If I do a simple test of ssh'ing into 129.236.252.13, I see that I alternately login into hypatia-tb and orestes-tb. All is good.

Now take orestes-tb offline. The iptables rules on hypatia-tb are unchanged:

5    CLUSTERIP  all  --  0.0.0.0/0  10.43.7.13      CLUSTERIP hashmode=sourceip-sourceport clustermac=F1:87:E1:64:60:A5 total_nodes=2 local_node=1 hash_init=0
6    CLUSTERIP  all  --  0.0.0.0/0  10.44.7.13      CLUSTERIP hashmode=sourceip-sourceport clustermac=11:8F:23:B9:CA:09 total_nodes=2 local_node=1 hash_init=0
7    CLUSTERIP  all  --  0.0.0.0/0  129.236.252.13  CLUSTERIP hashmode=sourceip-sourceport clustermac=B1:95:5A:B5:16:79 total_nodes=2 local_node=1 hash_init=0

If I attempt to ssh to 129.236.252.13, whether or not I get in seems to be machine-dependent. On one machine I get in, from another I get a time-out. Both machines show the same MAC address for 129.236.252.13:

arp 129.236.252.13
Address                  HWtype  HWaddress           Flags Mask   Iface
hamilton-tb.nevis.colum  ether   B1:95:5A:B5:16:79   C            eth0

Is this the way the cloned IPaddr2 resource is supposed to behave in the event of a node failure, or have I set things up incorrectly?

--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://selig...@nevis.columbia.edu
PO Box 137 | Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
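For anyone reproducing this, the CLUSTERIP state can also be inspected directly; the /proc entry for each cluster IP should list the node number(s) the local host is currently answering for. A quick sketch using the first address from the configuration above:

iptables -L -n | grep CLUSTERIP
cat /proc/net/ipt_CLUSTERIP/129.236.252.13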