Re: [ClusterLabs] corosync - CS_ERR_BAD_HANDLE when multiple nodes are starting up
Hi,

Thomas Lamprecht wrote:
> Hello,
>
> we are using corosync version needle (2.3.5) for our cluster filesystem (pmxcfs). The situation is the following. First we start up the pmxcfs, which is a FUSE fs. And if there is a cluster configuration, we also start corosync. This allows the filesystem to exist on one-node 'clusters' or to be forced into a local mode. We use CPG to send our messages to all members; the filesystem is in RAM and all fs operations are sent 'over the wire'.
>
> The problem is now the following: when we're restarting all (in my test case 3) nodes at the same time, in 1 out of 10 cases I get only CS_ERR_BAD_HANDLE back when calling

I'm really unsure how to understand what you are doing. You are restarting all nodes and get CS_ERR_BAD_HANDLE? I mean, if you are restarting all nodes, which node returns CS_ERR_BAD_HANDLE? Or are you restarting just pmxcfs? Or just corosync?

> cpg_mcast_joined to send out the data, but only on one node. corosync-quorumtool shows that we have quorum, and the logs also show a healthy connection to the corosync cluster. The failing handle is initialized once at the initialization of our filesystem. Should it be reinitialized on every reconnect?

Again, I'm unsure what you mean by reconnect. On corosync shutdown you have to reconnect (I believe this is not the case because you are getting the error only with 10% probability).

> Restarting the filesystem solves this problem. The strange thing is that it isn't clearly reproducible and often works just fine.
>
> Are there some known problems or steps we should look for?

Hard to tell, but generally:
- Make sure cpg_init really returns CS_OK. If not, the returned handle is invalid.
- Make sure there is no memory corruption and the handle is really valid (valgrind may be helpful).

Regards,
Honza
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
[ClusterLabs] stopping a particular resource throughout the cluster
Hi.

Is it possible to stop a resource running on all nodes from a single node? Say that I have resource A running on node A and resource A running on node B. Is it possible to disable resource A from one node so that it does not run on either node?

--
With Regards
P.Vijay
Re: [ClusterLabs] [Linux-HA] fence_ec2 agent
Hi Kazuhiko-san,

On Mon, Sep 28, 2015 at 02:22:02PM +0900, 東一彦 wrote:
> Hi Dejan,
>
> I made a patch file as unified diff by "hg export tip" command.
>
> Would you please merge it ?

Merged. I just modified the summary and patch description a bit beforehand. Many thanks!

Cheers,

Dejan

>
> Regards,
> Kazuhiko Higashi
>
> On 2015/09/25 0:04, Dejan Muhamedagic wrote:
> >Hi Kazuhiko-san,
> >
> >On Wed, Mar 25, 2015 at 10:47:01AM +0900, 東一彦 wrote:
> >>Hi Markus,
> >>
> >>I implemented it for trial.
> >>
> >>[diff from http://hg.linux-ha.org/glue/rev/9da0680bc9c0 ]
> >>50d49
> >>< port_default=""
> >>60c59
> >>< ec2_tag=${tag}
> >>---
> >>>[ -n "$tag" ] && ec2_tag="$tag"
> >>63d61
> >>< : ${port=${port_default}}
> >>97c95
> >><
> >>---
> >>>
> >>105c103
> >><
> >>---
> >>>
> >>132c130
> >><
> >>---
> >>>
> >>142c140
> >><
> >>---
> >>>
> >>221a220,224
> >>>function monitor()
> >>>{
> >>> # Is the device ok?
> >>> aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> >>>}
> >>267a271
> >>>[ -n "$2" ] && node_to_fence=$2
> >>326a331,334
> >>>if [ -z "$port" ]; then
> >>> port="$node_to_fence"
> >>>fi
> >>>
> >>379,380c387
> >>< # Is the device ok?
> >>< aws ec2 describe-instances $options | grep INSTANCES &> /dev/null
> >>---
> >>> monitor
> >>391c398
> >>< instance_status $instance > /dev/null
> >>---
> >>> monitor
> >>
> >>It works fine on my environment with 2 patterns settings below.
> >>
> >>[pattern No.1]
> >>Without "port" and "tag" parameters.
> >>And instances has "Name=" tag.
> >>
> >>primitive prmStonith1-2 stonith:external/ec2 \
> >> params \
> >> pcmk_off_timeout="120s" \
> >> op start interval="0s" timeout="60s" \
> >> op monitor interval="3600s" timeout="60s" \
> >> op stop interval="0s" timeout="60s"
> >>
> >>[pattern No.2]
> >>With only "tag" parameter.(Without "port" parameter.)
> >>And, The 1st instance(node01) has "Cluster1=node01" tag.
> >>The 2nd instance(node02) has "Cluster1=node02" tag.
> >>
> >>primitive prmStonith1-2 stonith:external/ec2 \
> >> params \
> >> pcmk_off_timeout="120s" \
> >> tag="Cluster1" \
> >> op start interval="0s" timeout="60s" \
> >> op monitor interval="3600s" timeout="60s" \
> >> op stop interval="0s" timeout="60s"
> >>
> >
> >Sounds good. Sorry for the delay, but would it be possible that
> >you provide a patch as unified diff or similar so that we can
> >apply it.
> >
> >Cheers,
> >
> >Dejan
> >
> >>
> >>Regards,
> >>Kazuhiko Higashi
> >>
> >>
> >>On 2015/03/24 20:48, 東一彦 wrote:
> >>>Hi Markus,
> >>>
> >>>Thank you for the comment.
> >>>
> >>>> Would it be possible, to implement this idea as an additional
> >>>> configuration method to the fence_ec2 agent?
> >>>I think that your idea is good.
> >>>
> >>>So, I tries to implement it.
> >>>I'm going to change the fence_ec2(ec2) the following points.
> >>>
> >>> - the "tag" and the "port" options will be "not" required.
> >>>
> >>> - if the "port" option is not set, the 2nd argument of ec2 will use as
> >>> the "port".
> >>>- the 2nd argument of ec2 is "node to fence".
> >>>
> >>> - the "stat" and "status" action will be same the "monitor" action.
> >>>(for do not use the "port" parameter in "stat" action.)
> >>>
> >>>
> >>>By the above modifications, If it is described uname in the Name tag,
> >>>the setting of the "tag" and "port" parameters are no longer necessary.
> >>>
> >>>
> >>>primitive prmStonith1-2 stonith:external/ec2 \
> >>> params \
> >>> pcmk_off_timeout="120s" \
> >>> op start interval="0s" timeout="60s" \
> >>> op monitor interval="3600s" timeout="60s" \
> >>> op stop interval="0s" timeout="60s"
> >>>
> >>>
> >>>You can use "tag" parameter like your "Clustername" tag.
> >>>If cluster nodes(instances) have "Cluster1" tag, and uname is described in
> >>>that tag,
> >>>it works just like you to expect.
> >>>
> >>>
> >>>primitive prmStonith1-2 stonith:external/ec2 \
> >>> params \
> >>> pcmk_off_timeout="120s" \
> >>> tag="Cluster1" \
> >>> op start interval="0s" timeout="60s" \
> >>> op monitor interval="3600s" timeout="60s" \
> >>> op stop interval="0s" timeout="60s"
> >>>
> >>>
> >>>The 1st instance have "Cluster1=node01" tag-key.
> >>>The 2nd instance have "Cluster1=node02" tag-key.
> >>>The 3rd instance have "Cluster1=node03" tag-key.
> >>>...
> >>>The prmStonith1-2 can fence node01 , node02 and node03.
> >>>
> >>>
> >>>If you like above, I will implement that.
> >>>
> >>>
> >>>Regards,
> >>>Kazuhiko Higashi
> >>>
> >>>
> >>>On 2015/03/19 1:03, Markus Guertler wrote:
> >>>> Hi Kazuhiko, Dejan,
> >>>>
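
For readers following the tag-based setup discussed in this thread (tag key "Cluster1", tag value equal to each node's uname), the instance tags could be created with the AWS CLI roughly as sketched below. This is only an illustration, not part of the patch: the instance ID in the usage line is a placeholder, tag_node is a hypothetical helper, and the run wrapper merely prints the command unless APPLY=1 is exported.

```shell
#!/bin/sh
# Dry-run wrapper (an assumption of this sketch): print the command
# unless APPLY=1 is exported, in which case it is executed.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# tag_node INSTANCE_ID TAG_KEY UNAME  (hypothetical helper)
# Attach TAG_KEY=UNAME to an instance so the node name can be
# matched to the instance through that tag.
tag_node() {
    run aws ec2 create-tags --resources "$1" --tags "Key=$2,Value=$3"
}
```

Usage would look like `tag_node i-xxxxxxxx Cluster1 node01`, repeated once per cluster node.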
Re: [ClusterLabs] IPaddr2 Unkown interface cause a failover that didn't work
Hi,

On Wed, Sep 30, 2015 at 02:24:32PM -0400, Luc Paulin wrote:
> Hi Everyone,
> I have experience a weird issue last night where our cluster try to
> failover due to an "Unkown interface"
>
> Look like when the IPaddr2 monitor try to perform a status on eth0, it
> didn't find the device. Both node are VM. I haven't found any reason as why
> eth0 would have "disapear"
>
> [...]
> Sep 29 21:25:06 node-02 pengine[3240]: error: unpack_rsc_op: Preventing
> vip_v207_174 from re-starting anywhere: operation monitor failed 'not
> configured' (6)

The RA exits with the error code which says that the resource configuration is invalid. Hence the PE won't try to start that resource again. Normally, we don't expect network interfaces to disappear, but this should probably be the "not installed" error, so that the resource can be started on another node. Or even the "generic" error, in case it may be expected that interfaces can come and go. Did you figure out why the interface disappeared?

Thanks,

Dejan

> I know that I found some post that say to run sysctl -w
> net.ipv4.conf.all.promote_secondaries=1 to avoid secondary nic to be remove
> when primary is gone, but in this case the eth0 has a single nic that is
> manage through IPaddr2 within crm configuration
>
> Here's the configuration or node:
>
> Cluster Name: nodecluster1
> Corosync Nodes:
> node-01 node-02
> Pacemaker Nodes:
> node-01 node-02
>
> Resources:
> Group: lbpcivip
> Resource: vip_v207_174 (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=x.x.x.174 cidr_netmask=27 broadcast=x.x.x.191 nic=eth0
>   Operations: monitor interval=10s (vip_v207_174-monitor-interval-10s)
> Resource: vip_v26_1 (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=x.x.26.1
>   Operations: monitor interval=10s (vip_v26_1-monitor-interval-10s)
> Resource: vip_v27_1 (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=x.x.27.1
>   Operations: monitor interval=10s (vip_v27_1-monitor-interval-10s)
> Resource: vip_v254_230 (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=x.x.254.230
>   Operations: monitor interval=10s (vip_v254_230-monitor-interval-10s)
> Resource: change-default-fw (class=lsb type=fwdefaultgw)
>   Operations: monitor interval=60s (change-default-fw-monitor-interval-60s)
> Resource: fwcorp-mailto-sysadmin (class=ocf provider=heartbeat type=MailTo)
>   Attributes: email=i...@touchtunes.com subject="[node - Clustered services]"
>   Operations: monitor interval=60s (fwcorp-mailto-sysadmin-monitor-interval-60s)
>
> Stonith Devices:
> Fencing Levels:
>
> Location Constraints:
> Ordering Constraints:
> Colocation Constraints:
>
> Cluster Properties:
> cluster-infrastructure: cman
> dc-version: 1.1.11-97629de
> last-lrm-refresh: 1412269491
> no-quorum-policy: ignore
> stonith-enabled: false
>
>
> Has anyone have suggestion on how I can solve this issue? Why did the
> failover from node1 to node2 didn't work ?
>
> If more information is require let me know, any suggestion would be
> appreciated!
>
> Thanx!
>
> --
> Luc Paulin
> email: paulinster(at)gmail.com
> Skype: paulinster
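
The difference between the three exit codes mentioned in this reply can be sketched as follows. This is not the IPaddr2 agent's code; check_nic is a hypothetical helper that only illustrates how an agent's choice of OCF exit code for a missing interface would shape what the cluster does next.

```shell
#!/bin/sh
# Standard OCF exit codes, with the recovery Pacemaker associates with each.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1      # soft error: recovery is attempted, possibly on another node
OCF_ERR_INSTALLED=5    # hard error: this node is ruled out, other nodes may run it
OCF_ERR_CONFIGURED=6   # fatal error: the resource is not started anywhere

# check_nic NIC  (hypothetical helper, not part of IPaddr2)
# Reports a missing interface as "not installed", so only the local
# node is excluded and the resource could still start elsewhere.
check_nic() {
    if [ -e "/sys/class/net/$1" ]; then
        return $OCF_SUCCESS
    fi
    echo "interface $1 not found on $(uname -n)" >&2
    return $OCF_ERR_INSTALLED
}
```

Returning OCF_ERR_CONFIGURED from the same spot would reproduce the behaviour in the log above, where the resource was prevented from starting on any node.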
Re: [ClusterLabs] stopping a particular resource throughout the cluster
Hi,

On Thu, Oct 01, 2015 at 06:20:32PM +0530, Vijay Partha wrote:
> Hi.
>
> Is it possible to stop a resource running on all nodes from a single node.
> Say that i have resource A running on node A and resource A running on node
> B. Is it possible to disable the resource A from one node so that the
> resource A does not run on both nodes.

That sounds like a cloned resource. You can just stop it and it won't run anywhere.

Thanks,

Dejan
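
As a rough sketch of "just stop it", assuming the cluster is managed with pcs and the clone has an ID such as A-clone (a made-up name): disabling the resource from any one node sets its target role to Stopped cluster-wide. With crmsh the equivalent would be `crm resource stop <id>`. The run wrapper only prints the command unless APPLY=1 is exported.

```shell
#!/bin/sh
# Dry-run wrapper (an assumption of this sketch): print the command
# unless APPLY=1 is exported, in which case it is executed.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Stop a resource (all instances of a clone) everywhere in the cluster.
stop_everywhere() {
    run pcs resource disable "$1"
}

# Allow the cluster to start it again.
start_everywhere() {
    run pcs resource enable "$1"
}
```

The command can be issued on any single cluster node; the change is stored in the cluster configuration and applies to every node.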
[ClusterLabs] Antw: stopping a particular resource throughout the cluster
>>> Ulrich Windl wrote on 01.10.2015 at 16:04 in message <560D3D5F.860 : 161 : 60728>:
> Vijay Partha wrote on 01.10.2015 at 14:50 in message:
> > Hi.
> >
> > Is it possible to stop a resource running on all nodes from a single node.
> > Say that i have resource A running on node A and resource A running on node
> > B. Is it possible to disable the resource A from one node so that the
> > resource A does not run on both nodes.
>
> 1) Not using clone
> 2) use fencing
> 3) use a location constraint
> 4) ;-)

4) Means: "don't use a broken resource agent", of course...
[ClusterLabs] Antw: stopping a particular resource throughout the cluster
>>> Vijay Partha wrote on 01.10.2015 at 14:50 in message:
> Hi.
>
> Is it possible to stop a resource running on all nodes from a single node.
> Say that i have resource A running on node A and resource A running on node
> B. Is it possible to disable the resource A from one node so that the
> resource A does not run on both nodes.

1) Not using clone
2) use fencing
3) use a location constraint
4) ;-)

>
> --
> With Regards
> P.Vijay
[ClusterLabs] disable failover
Hi,

I want to know how to disable failover. If a node undergoes a failover, the resources running on that node should not be started on the other node in the cluster. How can this be achieved?

--
With Regards
P.Vijay
Re: [ClusterLabs] disable failover
On Thu, Oct 1, 2015 at 5:30 PM, Vijay Partha wrote:
> Hi,
>
> I want to know how to disable failover. If a node undergoes a failover the
> resources running on the node should not be started on the other node in the
> cluster. How can this be achieved.
>

What exactly does "node undergoes failover" mean? Nodes do not fail over - resources may fail over between nodes.
Re: [ClusterLabs] disable failover
For example: let's have a cluster of 2 nodes, node A and node B. Say I have resource A running on node A. If node A goes down, I don't want resource A to start on node B.

On Thu, Oct 1, 2015 at 8:18 PM, Andrei Borzenkov wrote:
> On Thu, Oct 1, 2015 at 5:30 PM, Vijay Partha wrote:
> > Hi,
> >
> > I want to know how to disable failover. If a node undergoes a failover the
> > resources running on the node should not be started on the other node in the
> > cluster. How can this be achieved.
> >
>
> What exactly "node undergoes failover" means? Nodes do not failover -
> resources may fail over between nodes.

--
With Regards
P.Vijay
Re: [ClusterLabs] disable failover
On Thu, Oct 1, 2015 at 5:54 PM, Vijay Partha wrote:
> For example. Lets have a cluster of 2 nodes node A and node B. Say on node A
> i have resource A running. If node A goes down i dont want the resource A to
> start on node B.
>

Do you want it temporarily (e.g. during maintenance) or permanently? For a permanent setup you can define constraints. Temporarily, you can set is-managed to false for the resources on this node (do not forget to undo it later), or set global maintenance mode (but this affects all resources on all nodes).
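
One possible shape for the permanent variant, assuming pcs is the management tool: a location constraint with which the resource avoids the other node. By default pcs gives an "avoids" constraint a score of -INFINITY, so the resource is never placed on that node. The names in the usage line are made up, and the run wrapper only prints the command unless APPLY=1 is exported.

```shell
#!/bin/sh
# Dry-run wrapper (an assumption of this sketch): print the command
# unless APPLY=1 is exported, in which case it is executed.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# ban_from_node RESOURCE NODE  (hypothetical helper)
# Create a location constraint so RESOURCE never runs on NODE.
ban_from_node() {
    run pcs constraint location "$1" avoids "$2"
}
```

For the example in this thread that would be something like `ban_from_node resourceA nodeB`. If node A goes down, resource A then simply stays stopped until node A returns.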
Re: [ClusterLabs] disable failover
I want this to be done on a permanent basis. Could you tell me which constraints have to be defined to achieve this?

On Thu, Oct 1, 2015 at 8:36 PM, Andrei Borzenkov wrote:
> On Thu, Oct 1, 2015 at 5:54 PM, Vijay Partha wrote:
> > For example. Lets have a cluster of 2 nodes node A and node B. Say on node A
> > i have resource A running. If node A goes down i dont want the resource A to
> > start on node B.
> >
>
> Do you want it temporary (e.g. during maintenance) or permanently?
> Permanently you can define constraints. Temporary you can set
> is-managed to false for resources on this node (do not forget to undo
> it later). Or set global maintenance mode (but this affects all
> resources on all nodes).

--
With Regards
P.Vijay
Re: [ClusterLabs] disable failover
On 10/01/2015 09:54 AM, Vijay Partha wrote:
> For example. Lets have a cluster of 2 nodes node A and node B. Say on node
> A i have resource A running. If node A goes down i dont want the resource A
> to start on node B.

I assume the goal is to do this temporarily, for example, to perform some maintenance on resource A? (If not, why put it in HA in the first place?)

You have a few options for temporary maintenance:

* You can make a particular resource or resources "unmanaged", which means Pacemaker will no longer try to start or stop them. To do this, set the resource's "is-managed" meta-attribute to false. You might also want to disable any recurring monitor operations on them, by setting the monitor operation's "enabled" option to false.

* You can put the entire cluster into maintenance mode, in which case all resources are made unmanaged. To do this, set the "maintenance-mode" cluster option to true.

You can start and stop services as desired at that point, however you shouldn't move a service when it is unmanaged (i.e. start a service on a different node than the cluster last thought it was on).

You can also put a node into standby mode to do maintenance on the node itself (e.g. reboot for a kernel update), but that will move all resources to the other node.

Of course, remember to undo those changes when done with maintenance, and realize that Pacemaker may then decide to move resources around if circumstances call for it.
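
The temporary-maintenance options described above could be driven with pcs roughly as sketched here. The helper names are made up for this sketch; the underlying pcs subcommands match the pcs 0.9 series current at the time of this thread (newer pcs releases use `pcs node standby` in place of `pcs cluster standby`). The run wrapper only prints commands unless APPLY=1 is exported.

```shell
#!/bin/sh
# Dry-run wrapper (an assumption of this sketch): print the command
# unless APPLY=1 is exported, in which case it is executed.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Option 1: stop managing a single resource (sets is-managed=false).
unmanage_resource() { run pcs resource unmanage "$1"; }
manage_resource()   { run pcs resource manage "$1"; }

# Option 2: cluster-wide maintenance mode; pass "true" or "false".
set_maintenance()   { run pcs property set "maintenance-mode=$1"; }

# Node maintenance: standby moves resources off the node.
standby_node()      { run pcs cluster standby "$1"; }
unstandby_node()    { run pcs cluster unstandby "$1"; }
```

Each helper has an explicit counterpart for undoing the change, since forgetting to re-enable management is the usual pitfall after maintenance.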
Re: [ClusterLabs] disable failover
Could you help me out on this, please?

On Thu, Oct 1, 2015 at 8:38 PM, Vijay Partha wrote:
> I want this to be done on permanent basis. Could you tell me the
> constaints that has to be given for this to be achieved.

--
With Regards
P.Vijay
Re: [ClusterLabs] disable failover
On 10/01/2015 05:35 PM, Vijay Partha wrote:
> Could u help me out on this please.

It would help if you could elaborate on why you want an HA stack if you don't want to use the stack.

But if you don't want HA, then just do not install HA, do not configure this application as a resource in the HA stack, and start it on the command line or use the standard start-stop system of your Linux.

greetings
Kai Dupke
Senior Product Manager
Server Product Line
--
Sell not virtue to purchase wealth, nor liberty to purchase power.
Phone: +49-(0)5102-9310828 Mail: kdu...@suse.com
Mobile: +49-(0)173-5876766 WWW: www.suse.com

SUSE Linux GmbH - Maxfeldstr. 5 - 90409 Nuernberg (Germany)
GF:Felix Imendörffer,Jane Smithard,Graham Norton,HRB 21284 (AG Nürnberg)
Re: [ClusterLabs] Asterisk as a resource
Hi,

I'm a newbie, so sorry for these questions, but I can't find any useful docs.

I added the OCF resource agent for asterisk to my heartbeat lib. I used this command to add a resource:

pcs resource create pbx ocf:heartbeat:asterisk params user="root" group="root" maxfiles="65536" op start interval="1" timeout="30s" op monitor interval="5s" timeout="30s"

But when I run "pcs status", I get "FAILED (unmanaged)" and this error:

pbx_start_0 on ha-1 'unknown error' (1): call=12, status=Timed Out, exitreason='none', last-rc-change='Thu Oct 1 23:40:53 2015', queued=0ms, exec=20003ms

So what is the problem? (I configured IPaddr2 too and it works.)

Thanks for any reply.

From: H Yavari

Hi,

I want to add Asterisk PBX as a resource to pacemaker/corosync. I'm using the latest version (version 1.1.13-a14efad). I searched but could only find configuration for old versions. Can you give me some hints for the config? Thanks.

Regards.
Re: [ClusterLabs] disable failover
I want Pacemaker to monitor the resources running on each node and at the same time restart them. Each resource should keep running on the same node.

On Thu, Oct 1, 2015 at 9:17 PM, Kai Dupke wrote:
> On 10/01/2015 05:35 PM, Vijay Partha wrote:
> > Could u help me out on this please.
>
> It would help if you could elaborate on the wish for an HA stack, if you
> don't want to use the stack.
>
> But if you don't want HA, then just do not install HA & do not configure
> this application as resource in the HA stack and start it on the command
> line / use the standard start-stop system of your Linux.

--
With Regards
P.Vijay
Re: [ClusterLabs] disable failover
01.10.2015 19:09, Vijay Partha wrote:
> i want pacemaker to monitor the resources running on each node and at the
> same time restart it. It should run on the same node.

Then create a single-node cluster. Why do you add a second node if you do not want to use it?
Re: [ClusterLabs] Asterisk as a resource
On 10/01/2015 11:04 AM, H Yavari wrote: > Hi, > I'm newbie so sorry for this questions.But I can't find any usuful doc. > I added ocf resource agent of asterisk to my heartbeat lib. I used this > command to add a resource :pcs resource create pbx ocf:heartbeat:asterisk > params user="root" group="root" maxfiles="65536" op start interval="1" > timeout="30s" op monitor interval="5s" timeout="30s" > > but when I run "pcs status", I received "FAILED (unmanaged)" and " > pbx_start_0 on ha-1 'unknown error' (1): call=12, status=Timed Out, > exitreason='none', > last-rc-change='Thu Oct 1 23:40:53 2015', queued=0ms, exec=20003ms" > errors. > So what is problem? I'd take the asterisk resource out of the cluster first, and make sure it can be started manually with no errors. If so, I'd next try calling the resource agent directly to see what error it reports. I haven't used the asterisk resource agent so I can't be much more specific than that. FYI, some issues to consider when running asterisk HA: * The easiest setup is pure SIP. If you have a physical line (T1, ISDN, whatever), that complicates the situation significantly. * It's best to have two SIP providers so that you don't have a single point of failure there. FreePBX (based on asterisk) has some nice features to simplify this. * You need shared/replicated storage for asterisk's files (voice mails, etc.). * In the past, I've run FreePBX inside a VM, and made the VM the HA resource instead of asterisk directly. That can simplify the HA setup. VMs have more startup time, but there's the possible benefit of live migration. I expect using a Docker container would be another good alternative. > (I configured IPaddr2 too and It's work.) > > Thanks for reply. > > > From: H Yavari > > > > > Hi, > I want to add Asterisk pbx as a rsource to pacemaker/corosync. I'm using that > latest version (version 1.1.13-a14efad). I searched but I could find only old > version configuration.Can you give me some hints for configs?Thanks. 
> Regards.
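
The second troubleshooting step suggested in this reply, calling the resource agent directly, could look roughly like the sketch below. It is not from the thread: it assumes the usual resource-agents layout (OCF_ROOT=/usr/lib/ocf, with the agent at resource.d/heartbeat/asterisk under it), and the helper names build_agent_env and run_agent are made up for illustration. It relies only on the OCF convention that an agent takes the action as its first argument and reads each resource parameter NAME from the environment variable OCF_RESKEY_NAME.

```python
import os
import subprocess

# Assumed default install location of the resource-agents package.
OCF_ROOT = "/usr/lib/ocf"


def build_agent_env(params, base=None, ocf_root=OCF_ROOT):
    """Return the environment an OCF agent expects when run by hand.

    Each resource parameter NAME is exported as OCF_RESKEY_NAME.
    `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["OCF_ROOT"] = ocf_root
    for name, value in params.items():
        env["OCF_RESKEY_" + name] = str(value)
    return env


def run_agent(agent_path, action, params, timeout=60):
    """Run one agent action (e.g. "start" or "monitor"); return its exit code.

    Raises subprocess.TimeoutExpired if the action exceeds `timeout` seconds.
    """
    result = subprocess.run(
        [agent_path, action],
        env=build_agent_env(params),
        timeout=timeout,
    )
    return result.returncode


# Hypothetical usage, as root on a cluster node, with the resource
# removed from the cluster first (parameters copied from the pcs command):
#   params = {"user": "root", "group": "root", "maxfiles": "65536"}
#   agent = os.path.join(OCF_ROOT, "resource.d", "heartbeat", "asterisk")
#   print(run_agent(agent, "start", params))
```

An exit code of 0 from start would mean the agent itself is fine, which would point back at the cluster configuration (for example the operation timeouts); a non-zero code, together with whatever the agent prints, narrows down the failure.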
Re: [ClusterLabs] IPaddr2 Unkown interface cause a failover that didn't work
2015-10-01 9:30 GMT-04:00 Dejan Muhamedagic :
> Hi,
>
> On Wed, Sep 30, 2015 at 02:24:32PM -0400, Luc Paulin wrote:
> > Hi Everyone,
> > I experienced a weird issue last night where our cluster tried to
> > fail over due to an "Unknown interface" error.
> >
> > It looks like when the IPaddr2 monitor tried to check the status of eth0, it
> > didn't find the device. Both nodes are VMs. I haven't found any reason why
> > eth0 would have "disappeared".
> >
> > [...]
> > Sep 29 21:25:06 node-02 pengine[3240]: error: unpack_rsc_op: Preventing
> > vip_v207_174 from re-starting anywhere: operation monitor failed 'not
> > configured' (6)
>
> The RA exits with the error code which says that the resource
> configuration is invalid. Hence PE won't try to start that
> resource again. Normally, we don't expect network interfaces to
> disappear, but this should probably be the "not installed" error,
> so that the resource can be started on another node. Or even the
> "generic" error in case it may be expected that interfaces can
> come and go. Did you figure out why the interface disappeared?

No, we haven't been able to figure out why the interface disappeared. Actually, it doesn't seem to have disappeared, as we have no evidence in the kernel log that the interface was gone. As you say, this should probably have been a "not installed" or "generic" error so that the cluster tries to start the resource on another node, but obviously a network interface that disappears is not something we expect to see.
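
The difference between the three exit codes discussed above can be written down as a small lookup. This is only a sketch: the numeric codes and the soft/hard/fatal recovery types are taken from the OCF return code table in the Pacemaker documentation as I understand it, only the three codes mentioned in this thread are listed, and the names OCF_CODES, RECOVERY_ACTION and recovery_type are made up for illustration.

```python
# OCF return codes mentioned in the thread: (symbolic name, recovery type).
OCF_CODES = {
    1: ("OCF_ERR_GENERIC", "soft"),       # "generic" / 'unknown error'
    5: ("OCF_ERR_INSTALLED", "hard"),     # "not installed"
    6: ("OCF_ERR_CONFIGURED", "fatal"),   # "not configured"
}

# What each recovery type means for where the resource may run next.
RECOVERY_ACTION = {
    "soft": "restart the resource or move it to another node",
    "hard": "move the resource elsewhere; do not retry on this node",
    "fatal": "stop the resource; do not start it on any node",
}


def recovery_type(rc):
    """Return "soft", "hard" or "fatal" for a listed code, else None."""
    entry = OCF_CODES.get(rc)
    return entry[1] if entry else None


# The monitor in the log above returned 6 ('not configured').
print(RECOVERY_ACTION[recovery_type(6)])
```

Under these assumptions, a monitor that returns 6 keeps the resource from starting on any node, which matches the "Preventing vip_v207_174 from re-starting anywhere" log line, whereas 5 or 1 would have allowed a start on the other node.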
> Thanks,
>
> Dejan
>
> > I know that I found some posts that say to run
> > sysctl -w net.ipv4.conf.all.promote_secondaries=1
> > to avoid the secondary nic being removed when the primary is gone, but in
> > this case eth0 has a single nic that is managed through IPaddr2 within the
> > crm configuration.
> >
> > Here's the node configuration:
> >
> > Cluster Name: nodecluster1
> > Corosync Nodes:
> >  node-01 node-02
> > Pacemaker Nodes:
> >  node-01 node-02
> >
> > Resources:
> > Group: lbpcivip
> > Resource: vip_v207_174 (class=ocf provider=heartbeat type=IPaddr2)
> >   Attributes: ip=x.x.x.174 cidr_netmask=27 broadcast=x.x.x.191 nic=eth0
> >   Operations: monitor interval=10s (vip_v207_174-monitor-interval-10s)
> > Resource: vip_v26_1 (class=ocf provider=heartbeat type=IPaddr2)
> >   Attributes: ip=x.x.26.1
> >   Operations: monitor interval=10s (vip_v26_1-monitor-interval-10s)
> > Resource: vip_v27_1 (class=ocf provider=heartbeat type=IPaddr2)
> >   Attributes: ip=x.x.27.1
> >   Operations: monitor interval=10s (vip_v27_1-monitor-interval-10s)
> > Resource: vip_v254_230 (class=ocf provider=heartbeat type=IPaddr2)
> >   Attributes: ip=x.x.254.230
> >   Operations: monitor interval=10s (vip_v254_230-monitor-interval-10s)
> > Resource: change-default-fw (class=lsb type=fwdefaultgw)
> >   Operations: monitor interval=60s (change-default-fw-monitor-interval-60s)
> > Resource: fwcorp-mailto-sysadmin (class=ocf provider=heartbeat type=MailTo)
> >   Attributes: email=i...@touchtunes.com subject="[node - Clustered services]"
> >   Operations: monitor interval=60s (fwcorp-mailto-sysadmin-monitor-interval-60s)
> >
> > Stonith Devices:
> > Fencing Levels:
> >
> > Location Constraints:
> > Ordering Constraints:
> > Colocation Constraints:
> >
> > Cluster Properties:
> >  cluster-infrastructure: cman
> >  dc-version: 1.1.11-97629de
> >  last-lrm-refresh: 1412269491
> >  no-quorum-policy: ignore
> >  stonith-enabled: false
> >
> > Does anyone have a suggestion on how I can solve this issue?
> > Why didn't the failover from node1 to node2 work?
> >
> > If more information is required, let me know. Any suggestion would be
> > appreciated!
> >
> > Thanx!
> >
> > --
> >       !
> >    ( o o )
> > --oOO(_)OOo--
> > Luc Paulin
> > email: paulinster(at)gmail.com
> > Skype: paulinster
[ClusterLabs] Antw: Re: disable failover
>>> Kai Dupke wrote on 01.10.2015 at 17:47 in message <560d5598.2080...@suse.com>:
> On 10/01/2015 05:35 PM, Vijay Partha wrote:
>> Could you help me out on this please.
>
> It would help if you could elaborate on why you want an HA stack if you
> don't want to use the stack.
>
> But if you don't want HA, then just do not install HA, do not configure
> this application as a resource in the HA stack, and start it on the command
> line or use the standard start-stop system of your Linux.

Maybe monit is the solution for this case.

>
> greetings
> Kai Dupke
> Senior Product Manager
> Server Product Line
> --
> Sell not virtue to purchase wealth, nor liberty to purchase power.
> Phone: +49-(0)5102-9310828 Mail: kdu...@suse.com
> Mobile: +49-(0)173-5876766 WWW: www.suse.com
>
> SUSE Linux GmbH - Maxfeldstr. 5 - 90409 Nuernberg (Germany)
> GF:Felix Imendörffer,Jane Smithard,Graham Norton,HRB 21284 (AG Nürnberg)