[ClusterLabs] Best-practices for changing networks settings in a cluster?

2018-11-05 Thread Ryan Thomas
I have a two-node cluster. I restart the network after making changes to the network settings. But as soon as I restart the network, I see that corosync/pacemaker are killed, causing resources to fail over to the other node. It looks like this is due to

Re: [ClusterLabs] How is fencing and unfencing suppose to work?

2018-09-28 Thread Ryan Thomas
ce was restarted, so it appeared the node I just killed, immediately came back. Thanks, Ryan On Tue, Sep 4, 2018 at 7:49 PM Ken Gaillot wrote: > On Tue, 2018-08-21 at 10:23 -0500, Ryan Thomas wrote: > > I’m seeing unexpected behavior when using “unfencing” – I don’t think > >

Re: [ClusterLabs] Q: ordering for a monitoring op only?

2018-08-21 Thread Ryan Thomas
You could accomplish this by creating a custom RA that normally acts as a pass-through and calls the "real" RA. However, it intercepts "monitor" actions and checks nfs: if nfs is down it returns success; otherwise it passes the monitor action through to the real RA. If nfs fails the monitor
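
The pass-through idea described in this message could be sketched roughly as follows. This is only an illustration under stated assumptions, not the poster's actual agent: `run_wrapped_action`, `nfs_is_up`, and `call_real_ra` are hypothetical names, and how nfs is checked or how the real RA is invoked is left to the caller. The only fixed value is the OCF success exit code, 0.

```python
OCF_SUCCESS = 0  # standard OCF exit code for success


def run_wrapped_action(action, nfs_is_up, call_real_ra):
    """Decide the result of one RA action for a pass-through wrapper.

    action       -- the OCF action name, e.g. "start", "stop", "monitor"
    nfs_is_up    -- hypothetical callable returning True when nfs is healthy
    call_real_ra -- hypothetical callable that runs the real RA with the
                    given action and returns its exit code
    """
    # Intercept only "monitor": while nfs is down, report success so the
    # cluster does not treat the wrapped resource as failed.
    if action == "monitor" and not nfs_is_up():
        return OCF_SUCCESS
    # Every other case is passed through to the real RA unchanged.
    return call_real_ra(action)
```

In a real agent the two callables would wrap an nfs health check and a subprocess call to the real RA script; they are injected here so the decision logic can be exercised on its own.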

[ClusterLabs] How is fencing and unfencing suppose to work?

2018-08-21 Thread Ryan Thomas
I’m seeing unexpected behavior when using “unfencing” – I don’t think I’m understanding it correctly. I configured a resource that “requires unfencing” and have a custom fencing agent which “provides unfencing”. I perform a simple test where I set up the cluster and then run “pcs stonith fence

Re: [ClusterLabs] Q: HA_RSCTMP in SLES11 SP4 at first start after reboot

2018-08-13 Thread Ryan Thomas
I've had similar problems in the past. In my case, it was because pacemaker was running as user 'hacluster' in group 'haclient', so it didn't have permission to access the root-owned file. To fix the problem, I changed the ownership of the file that was causing the permissions error. e.g

Re: [ClusterLabs] How to implement a fencing agent

2018-08-09 Thread Ryan Thomas
"yum install fence-agents-common.x86_64" the appropriate way to install the fencing.py library? I suspect there may not be any best practice in this area. Thanks, Ryan On Thu, Aug 9, 2018 at 12:24 PM, Ryan Thomas wrote: > Thanks for the advice and information. > > >> 1. The d

Re: [ClusterLabs] How to implement a fencing agent

2018-08-09 Thread Ryan Thomas
fencing' > /usr/sbin/fence_foo" and "chmod +x /usr/sbin/fence_foo". After doing this the "stonith_admin --list-installed" did list fence_foo. However, the "pcs stonith list" > did not, so there must be something more that needs to be done for the fence

[ClusterLabs] How to implement a fencing agent

2018-08-08 Thread Ryan Thomas
I’m attempting to implement a fencing agent. The ClusterLabs/fence-agent github repo has some helpful information including fence-agents/doc/FenceAgentAPI.md, but I haven’t found the answers to a few basic questions. 1. The documentation encourages the use of the python fencing library. How

[ClusterLabs] Knowing where a resource is running

2018-05-16 Thread Ryan Thomas
I’m attempting to implement a resource that is “master” on only one node, while the other slave nodes know where the resource is running so they can forward requests to the “master” node. It seems like this can be accomplished by creating a multi-state resource configured with 1 master with

[ClusterLabs] Failing operations immediately when node is known to be down

2018-04-12 Thread Ryan Thomas
I’m trying to implement an HA solution which recovers very quickly when a node fails. In my configuration, when I reboot a node, I see in the logs that pacemaker realizes the node is down and decides to move all resources to the surviving node. To do this, it initiates a ‘stop’ operation on each