[ClusterLabs] Resource alerts

2018-10-23 Thread Leon Steffens
Hi all, We are trying to set up resource alerts on our Pacemaker 1.1.18 cluster. When a specific resource gets stopped, we want to trigger an alert on a *different* node to the node the resource is running on. But, it looks like resource alerts are only sent to the node the resource is being stop

Re: [ClusterLabs] crm_resource --wait

2017-10-21 Thread Leon Steffens
or utilization to sv-fencer. On Wed, 2017-10-11 at 14:01 +1000, Leon Steffens wrote: > I've attached two files: > 314 = after standby step > 315 = after resource update > > On Wed, Oct 11, 2017 at 12:22 AM, Ken Gaillot > wrote: > > On Tue, 2017-10-10 at 15:19 +1000,

Re: [ClusterLabs] crm_resource --wait

2017-10-10 Thread Leon Steffens
I've attached two files: 314 = after standby step 315 = after resource update On Wed, Oct 11, 2017 at 12:22 AM, Ken Gaillot wrote: > On Tue, 2017-10-10 at 15:19 +1000, Leon Steffens wrote: > > Hi Ken, > > > > I managed to reproduce this on a simplified version

Re: [ClusterLabs] crm_resource --wait

2017-10-09 Thread Leon Steffens
Hi Ken, I managed to reproduce this on a simplified version of the cluster, and on Pacemaker 1.1.15, 1.1.16, as well as 1.1.18-rc1. The steps to create the cluster are: pcs property set stonith-enabled=false pcs property set placement-strategy=balanced pcs node utilization vm1 cpu=100 pcs node u

Re: [ClusterLabs] crm_resource --wait

2017-10-09 Thread Leon Steffens
> > > Pending actions: > > Action 40: sv_fencer_monitor_6 on brilxvm44 > > Action 39: sv_fencer_start_0 on brilxvm44 > > Action 38: sv_fencer_stop_0 on brilxvm43 > > Error performing operation: Timer expired > > > > It looks like it's waiting for the sv_fencer fencing agent to start > > on bril

[ClusterLabs] crm_resource --wait

2017-10-08 Thread Leon Steffens
Hi all, We have a use case where we want to place a node into standby and then wait for all the resources to move off the node (and be started on other nodes) before continuing. In order to do this we call: $ pcs cluster standby brilxvm45 $ crm_resource --wait --timeout 300 This works most of th

Re: [ClusterLabs] Cannot stop cluster due to order constraint

2017-09-17 Thread Leon Steffens
>> >> pcs constraint order start main1 then stop backup1 kind=Serialize > > I think you want kind=Optional here. "Optional" means that if both > actions are needed in the same transition, perform them in this order, > otherwise it doesn't limit anything. "Serialize" means the start and > stop ca

Re: [ClusterLabs] Corosync on a home network

2017-09-11 Thread Leon Steffens
Is the firewalld service running? Just did a quick test on my CentOS 7 installation and by default SSH is allowed through the firewall, but corosync cannot connect to the other nodes. Try: systemctl stop firewalld.service > On 12 Sep 2017, at 8:04 am, J Martin Rushton > wrote: > > Hi, >

[ClusterLabs] Cannot stop cluster due to order constraint

2017-09-07 Thread Leon Steffens
Hi all, We are running Pacemaker 1.1.15 under CentOS 6.9, and have a simple 3-node cluster with 6 sets of "main" and "backup" resources (just Dummy ones): main1 backup1 main2 backup2 etc. We have the following co-location constraint between main1 and backup1 (-200 because we don't want them to b

Re: [ClusterLabs] Pacemaker in Azure

2017-08-24 Thread Leon Steffens
Unfortunately I can't post the full resource agent here. In our search for solutions we did find a resource agent for managing AWS Elastic IPs: https://github.com/moomindani/aws-eip-resource-agent/blob/master/eip. This was not what we wanted, but it will give you an idea of how it can work. Our

Re: [ClusterLabs] Pacemaker in Azure

2017-08-24 Thread Leon Steffens
nt on the IPaddr2 resource, and it worked fine. Leon Steffens On Fri, Aug 25, 2017 at 8:34 AM, Eric Robinson wrote: > > Don't use Azure? ;) > > That would be my preference. But since I'm stuck with Azure (management > decision) I need to come up with something. It appe