Re: [ClusterLabs] Pacemaker not invoking monitor after $interval

2016-05-20 Thread Felix Zachlod
> -----Original Message-----
> From: Jehan-Guillaume de Rorthais [mailto:j...@dalibo.com]
> Sent: Friday, 20 May 2016 13:52
> To: Felix Zachlod (Lists) 
> Cc: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Pacemaker not invoking monitor after $interval
> 
> Le Fri, 20 May 2016 11:33:39 +,
> "Felix Zachlod (Lists)"  a écrit :
> 
> > Hello!
> >
> > I am currently working on a cluster setup which includes several resources
> > with "monitor interval=XXs" set. As far as I understand this should run the
> > monitor action on the resource agent every XX seconds. But it seems it
> > doesn't.
> 
> How do you know it doesn't? Are you looking at crm_mon? log files?

I created debug output from my RA. Furthermore I had a blackbox dump.
But it now turned out that for my resource I had to change the meta-data to
advertise the monitor action twice (one for the Slave role, one for the Master
role) and to configure

op monitor role=x interval=y instead of op monitor interval=x

Since I changed that, monitoring is working as desired for this resource, at
least for now. I am not sure why a Master/Slave resource has to have distinct
monitor actions advertised for both roles, but it seems related to that.
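
For anyone hitting the same thing, a hedged sketch of the resource side (agent
and resource names are placeholders, intervals arbitrary; note the two
role-specific monitors must use different intervals, and the RA meta-data has to
advertise a monitor action for each role as well):

    # crm shell configuration of a master/slave resource with per-role monitors
    primitive p_myres ocf:custom:myagent \
        op monitor role=Slave interval=31s timeout=20s \
        op monitor role=Master interval=15s timeout=20s
    ms ms_myres p_myres meta master-max=1 clone-max=2 notify=true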

I still don't see any monitor invocations in the log, but it seems there is
still something wrong with the log level.

Thanks anyway!

regards, Felix
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] cluster stops randomly

2016-05-20 Thread ‪H Yavari‬ ‪
Hi,
I have a cluster and it works well, but sometimes the cluster is stopped on
all nodes and I have to start it manually. The pcsd service is running but the
cluster is stopped. I checked the Pacemaker log but couldn't find any warning
or error. What is the issue?
(stonith is disabled.)

Regards, H.Yavari
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Node attributes

2016-05-20 Thread ‪H Yavari‬ ‪
Thank you. I used this and it works.
Regards.

  From: Ken Gaillot 
 To: users@clusterlabs.org 
 Sent: Thursday, 19 May 2016, 19:34:08
 Subject: Re: [ClusterLabs] Node attributes
   
On 05/18/2016 10:49 PM, ‪H Yavari‬ ‪ wrote:
> Hi,
> 
> How can I define a constraint for two resource based on one nodes
> attribute?
> 
> For example resource X and Y are co-located based on node attribute Z.
> 
> 
> 
> Regards,
> H.Yavari

Hi,

See
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm140617356537136

High-level tools such as pcs and crm provide a simpler interface, but
the concepts will be the same.

This works for location constraints, not colocation, but you can easily
accomplish what you want. If your goal is that X and Y each can only run
on a node with attribute Z, then set up a location constraint for each
one using the appropriate rule. If your goal is that X and Y must be
colocated together, on a node with attribute Z, then set up a regular
colocation constraint between them, and a location constraint for one of
them with the appropriate rule; or, put them in a group, and set up a
location constraint for the group with the appropriate rule.
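
For example, with pcs (hedged: the attribute name Z and value "yes" are
placeholders, and the rule syntax assumes a reasonably recent pcs), the first
approach could look like:

    # keep X and Y off any node that does not carry attribute Z=yes
    pcs constraint location X rule score=-INFINITY not_defined Z or Z ne yes
    pcs constraint location Y rule score=-INFINITY not_defined Z or Z ne yes

    # if X and Y must also run together, add a plain colocation between them
    pcs constraint colocation add Y with X INFINITY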

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


  ___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Resource seems to not obey constraint

2016-05-20 Thread Ken Gaillot
On 05/20/2016 10:29 AM, Leon Botes wrote:
> I push the following config.
> The iscsi-target fails as it tries to start on iscsiA-node1
> This is because I have no target installed on iscsiA-node1 which is by
> design. All services listed here should only start on  iscsiA-san1
> iscsiA-san2.
> I am using using the iscsiA-node1 basically for quorum and some other
> minor functions.
> 
> Can someone please show me where I am going wrong?
> All services should start on the same node, order is drbd-master
> vip-blue vip-green iscsi-target iscsi-lun
> 
> pcs -f ha_config property set symmetric-cluster="true"
> pcs -f ha_config property set no-quorum-policy="stop"
> pcs -f ha_config property set stonith-enabled="false"
> pcs -f ha_config resource defaults resource-stickiness="200"
> 
> pcs -f ha_config resource create drbd ocf:linbit:drbd drbd_resource=r0
> op monitor interval=60s
> pcs -f ha_config resource master drbd master-max=1 master-node-max=1
> clone-max=2 clone-node-max=1 notify=true
> pcs -f ha_config resource create vip-blue ocf:heartbeat:IPaddr2
> ip=192.168.101.100 cidr_netmask=32 nic=blue op monitor interval=20s
> pcs -f ha_config resource create vip-green ocf:heartbeat:IPaddr2
> ip=192.168.102.100 cidr_netmask=32 nic=green op monitor interval=20s
> pcs -f ha_config resource create iscsi-target ocf:heartbeat:iSCSITarget
> params iqn="iqn.2016-05.trusc.net" implementation="lio-t" op monitor
> interval="30s"
> pcs -f ha_config resource create iscsi-lun
> ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2016-05.trusc.net"
> lun="1" path="/dev/drbd0"
> 
> pcs -f ha_config constraint colocation add vip-blue drbd-master INFINITY
> with-rsc-role=Master
> pcs -f ha_config constraint colocation add vip-green drbd-master
> INFINITY with-rsc-role=Master
> 
> pcs -f ha_config constraint location drbd-master prefers stor-san1=500
> pcs -f ha_config constraint location drbd-master avoids stor-node1=INFINITY

The above constraint is an example of how to ban a resource from a node.
However stor-node1 is not a valid node name in your setup (maybe an
earlier design?), so this particular constraint won't have any effect.
If you want to ban certain resources from iscsiA-node1, add constraints
like the above for each resource, using the correct node name.
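
For example (hedged; adjust to the resources you actually run on the SAN
nodes), the bans could look like:

    pcs -f ha_config constraint location drbd-master avoids iscsiA-node1=INFINITY
    pcs -f ha_config constraint location iscsi-target avoids iscsiA-node1=INFINITY
    pcs -f ha_config constraint location iscsi-lun avoids iscsiA-node1=INFINITY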

> pcs -f ha_config constraint order promote drbd-master then start vip-blue
> pcs -f ha_config constraint order start vip-blue then start vip-green
> pcs -f ha_config constraint order start vip-green then start iscsi-target
> pcs -f ha_config constraint order start iscsi-target then start iscsi-lun
> 
> Results:
> 
> [root@san1 ~]# pcs status
> Cluster name: storage_cluster
> Last updated: Fri May 20 17:21:10 2016  Last change: Fri May 20
> 17:19:43 2016 by root via cibadmin on iscsiA-san1
> Stack: corosync
> Current DC: iscsiA-san1 (version 1.1.13-10.el7_2.2-44eb2dd) - partition
> with quorum
> 3 nodes and 6 resources configured
> 
> Online: [ iscsiA-node1 iscsiA-san1 iscsiA-san2 ]
> 
> Full list of resources:
> 
>  Master/Slave Set: drbd-master [drbd]
>  Masters: [ iscsiA-san1 ]
>  Slaves: [ iscsiA-san2 ]
>  vip-blue   (ocf::heartbeat:IPaddr2):   Started iscsiA-san1
>  vip-green  (ocf::heartbeat:IPaddr2):   Started iscsiA-san1
>  iscsi-target   (ocf::heartbeat:iSCSITarget):   FAILED iscsiA-node1
> (unmanaged)
>  iscsi-lun  (ocf::heartbeat:iSCSILogicalUnit):  Stopped
> 
> Failed Actions:
> * drbd_monitor_0 on iscsiA-node1 'not installed' (5): call=6, status=Not
> installed, exitreason='none',
> last-rc-change='Fri May 20 17:19:44 2016', queued=0ms, exec=0ms
> * iscsi-target_stop_0 on iscsiA-node1 'not installed' (5): call=24,
> status=complete, exitreason='Setup problem: couldn't find command:
> targetcli',
> last-rc-change='Fri May 20 17:19:45 2016', queued=0ms, exec=18ms
> * iscsi-lun_monitor_0 on iscsiA-node1 'not installed' (5): call=22,
> status=complete, exitreason='Undefined iSCSI target implementation',
> last-rc-change='Fri May 20 17:19:44 2016', queued=0ms, exec=27ms

The above failures will still occur even if you add the proper
constraints, because these are probes. Before starting a resource,
Pacemaker probes it on all nodes, to make sure it's not already running
somewhere. You can prevent this when you know it is impossible that the
resource could be running on a particular node, by adding
resource-discovery=never when creating the constraint banning it from
that node.
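
A hedged example of such a constraint, using the explicit "location add" form,
which accepts the resource-discovery option (the constraint IDs are made up, and
this assumes a pcs version that supports the option):

    pcs -f ha_config constraint location add ban-target-node1 iscsi-target iscsiA-node1 -INFINITY resource-discovery=never
    pcs -f ha_config constraint location add ban-lun-node1 iscsi-lun iscsiA-node1 -INFINITY resource-discovery=never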

> 
> PCSD Status:
>   iscsiA-san1: Online
>   iscsiA-san2: Online
>   iscsiA-node1: Online
> 
> Daemon Status:
>   corosync: active/disabled
>   pacemaker: active/disabled
>   pcsd: active/disabled
> 


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Issue in resource constraints and fencing - RHEL7 - AWS EC2

2016-05-20 Thread Ken Gaillot
On 05/20/2016 10:02 AM, Pratip Ghosh wrote:
> Hi All,
> 
> I am implementing 2 node RedHat (RHEL 7.2) HA cluster on Amazon EC2
> instance. For floating IP I am using a shell script provided by AWS so
> that virtual IP float to another instance if any one server failed with
> health check. In basic level cluster is working but I have 2 issues on
> that which I describe in bellow.
> 
> ISSUE 1
> =
> Now I need to configure fencing/STONITH to avoid split brain scenario in
> storage cluster. I want to use multi-primari (Active/Active) DRBD in my
> cluster for distributed storage. Is it possible to configure power
> fencing on AWS EC2 instance? Can any one please guide me on this?

There has been some discussion about this on this list before -- see
http://search.gmane.org/?query=ec2&group=gmane.comp.clustering.clusterlabs.user

Basically, there is an outdated agent available at
https://github.com/beekhof/fence_ec2 and a newer fork of it in the
(RHEL-incompatible) cluster-glue package. So with some work you may be
able to get something working.

> 
> ISSUE2
> =
> Currently I am using single  primary DRBD distributed storage. I added
> cluster resources so that if any cluster node goes down then another
> cluster node will promoted DRBD volume as primary and mount it on
> /var/www/html.
> 
> This configuration is working but for only if cluster node1 goes down.
> If cluster node2 goes down all cluster resources fails over towards
> cluster node1 but whenever cluster node2 again become on-line then
> virtual_ip (cluster ip) ownership automatically goes towards cluster
> node2 again. All the remaining resources not failed over like that. In
> that case secondary IP stays with Node1 and ownership goes to Node2.
> 
> I think this is an issue with resource stickiness or resource constraint
> but here I am totally clueless. Can any one please help me on this?
> 
> 
> My cluster details:
> ===
> 
> [root@drbd01 ~]# pcs config
> Cluster Name: web_cluster
> Corosync Nodes:
>  ec2-52-24-8-124.us-west-2.compute.amazonaws.com
> ec2-52-27-70-12.us-west-2.compute.amazonaws.com
> Pacemaker Nodes:
>  ec2-52-24-8-124.us-west-2.compute.amazonaws.com
> ec2-52-27-70-12.us-west-2.compute.amazonaws.com
> 
> Resources:
>  Resource: virtual_ip (class=ocf provider=heartbeat type=IPaddr2)
>   Attributes: ip=10.98.70.100 cidr_netmask=24
>   Operations: start interval=0s timeout=20s (virtual_ip-start-interval-0s)
>   stop interval=0s timeout=20s (virtual_ip-stop-interval-0s)
>   monitor interval=30s (virtual_ip-monitor-interval-30s)
>  Resource: WebSite (class=ocf provider=heartbeat type=apache)
>   Attributes: configfile=/etc/httpd/conf/httpd.conf
> statusurl=http://10.98.70.100/server-status
>   Operations: start interval=0s timeout=40s (WebSite-start-interval-0s)
>   stop interval=0s timeout=60s (WebSite-stop-interval-0s)
>   monitor interval=1min (WebSite-monitor-interval-1min)
>  Master: WebDataClone
>   Meta Attrs: master-max=1 master-node-max=1 clone-max=2
> clone-node-max=1 notify=true
>   Resource: WebData (class=ocf provider=linbit type=drbd)
>Attributes: drbd_resource=r1
>Operations: start interval=0s timeout=240 (WebData-start-interval-0s)
>promote interval=0s timeout=90 (WebData-promote-interval-0s)
>demote interval=0s timeout=90 (WebData-demote-interval-0s)
>stop interval=0s timeout=100 (WebData-stop-interval-0s)
>monitor interval=60s (WebData-monitor-interval-60s)
>  Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)
>   Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs
>   Operations: start interval=0s timeout=60 (WebFS-start-interval-0s)
>   stop interval=0s timeout=60 (WebFS-stop-interval-0s)
>   monitor interval=20 timeout=40 (WebFS-monitor-interval-20)
> 
> Stonith Devices:
> Fencing Levels:
> 
> Location Constraints:
> Ordering Constraints:
>   promote WebDataClone then start WebFS (kind:Mandatory)
> (id:order-WebDataClone-WebFS-mandatory)
>   start WebFS then start virtual_ip (kind:Mandatory)
> (id:order-WebFS-virtual_ip-mandatory)
>   start virtual_ip then start WebSite (kind:Mandatory)
> (id:order-virtual_ip-WebSite-mandatory)
> Colocation Constraints:
>   WebSite with virtual_ip (score:INFINITY)
> (id:colocation-WebSite-virtual_ip-INFINITY)
>   WebFS with WebDataClone (score:INFINITY) (with-rsc-role:Master)
> (id:colocation-WebFS-WebDataClone-INFINITY)
>   WebSite with WebFS (score:INFINITY)
> (id:colocation-WebSite-WebFS-INFINITY)
> 
> Resources Defaults:
>  resource-stickiness: INFINITY

You don't have any constraints requiring virtual_ip to stay with any
other resource. So it doesn't.

You could colocate virtual_ip with WebFS, and drop the colocation of
WebSite with WebFS, but it would probably be easier to configure a group
with WebFS, virtual_ip, and WebSite. Then you would only need the
colocation of the group with WebDataClone's master role and a single
"promote WebDataClone then start the group" ordering constraint.
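
A hedged sketch of the group-based variant (the group name is made up, and the
existing per-resource colocation and ordering constraints would be removed
first):

    pcs resource group add webgroup WebFS virtual_ip WebSite
    pcs constraint colocation add webgroup with master WebDataClone INFINITY
    pcs constraint order promote WebDataClone then start webgroup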

Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Adam Spiers
Klaus Wenninger  wrote:
> On 05/20/2016 08:39 AM, Ulrich Windl wrote:
>  Jehan-Guillaume de Rorthais  schrieb am 19.05.2016 um 
>  21:29 in
> > Nachricht <20160519212947.6cc0fd7b@firost>:
> > [...]
> >> I was thinking of a use case where a graceful demote or stop action failed
> >> multiple times and to give a chance to the RA to choose another method to 
> >> stop
> >> the resource before it requires a migration. As instance, PostgreSQL has 3
> >> different kind of stop, the last one being not graceful, but still better 
> >> than
> >> a kill -9.
> >
> > For example the Xen RA tries a clean shutdown with a timeout of
> > about 2/3 of the timeout; it it fails it shuts the VM down the
> > hard way.
> >
> > I don't know Postgres in detail, but I could imagine a three step approach:
> > 1) Shutdown after current operations have finished
> > 2) Shutdown regardless of pending operations (doing rollbacks)
> > 3) Shutdown the hard way, requiring recovery on the next start (I think in 
> > Oracle this is called a "shutdown abort")
> >
> > Depending on the scenario one may start at step 2)
> >
> > [...]
> > I think RAs should not rely on "stop" being called multiple times for a 
> > resource to be stopped.

Well, this would be a major architectural change.  Currently if
stop fails once, the node gets fenced - period.  So if we changed
this, there would presumably be quite a bit of scope for making the
new design address whatever concerns you have about relying on "stop"
*sometimes* needing to be called multiple times.  For the sake of
backwards compatibility with existing RAs, I think we'd have to ensure
the current semantics still work.  But maybe there could be a new
option where RAs are allowed to return OCF_RETRY_STOP to indicate that
they want to escalate, or something.  However it's not clear how that
would be distinguished from an old RA returning the same value as
whatever we chose for OCF_RETRY_STOP.

> I see a couple of positive points in having something inside pacemaker
> that helps the RAs escalating
> their stop strategy:
> 
> - this way you have the same logging for all RAs - done within the RA it
> would look different with each of them
> - timeout-retry stuff is potentially prone to not being implemented
> properly - like this you have a proven
>   implementation within pacemaker
> - keeps logic within RA simpler and guides implementation in a certain
> direction that makes them look
>   more similar to each other making it easier to understand an RA you
> haven't seen before

Yes, all good points which I agree with.

> Of course there are basically two approaches to achieve this:
> 
> - give some global or per resource view of pacemaker to the RA and leave
> it to the RA to act in a
>   responsible manner (like telling the RA that there are x stop-retries
> to come)
> - handle the escalation withing pacemaker and already tell the RA what
> you expect it to do
>   like requesting a graceful / hard / emergency or however you would
> call it stop

I'd probably prefer the former, to avoid hardcoding any assumptions
about the different levels of escalation the RA might want to take.
That would almost certainly vary per RA.

However, we're slightly off-topic for this thread at this point ;-)

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Adam Spiers
Ken Gaillot  wrote:
> A recent thread discussed a proposed new feature, a new environment
> variable that would be passed to resource agents, indicating whether a
> stop action was part of a recovery.
> 
> Since that thread was long and covered a lot of topics, I'm starting a
> new one to focus on the core issue remaining:
> 
> The original idea was to pass the number of restarts remaining before
> the resource will no longer tried to be started on the same node. This
> involves calculating (fail-count - migration-threshold), and that
> implies certain limitations: (1) it will only be set when the cluster
> checks migration-threshold; (2) it will only be set for the failed
> resource itself, not for other resources that may be recovered due to
> dependencies on it.
> 
> Ulrich Windl proposed an alternative: setting a boolean value instead. I
> forgot to cc the list on my reply, so I'll summarize now: We would set a
> new variable like OCF_RESKEY_CRM_recovery=true whenever a start is
> scheduled after a stop on the same node in the same transition. This
> would avoid the corner cases of the previous approach; instead of being
> tied to migration-threshold, it would be set whenever a recovery was
> being attempted, for any reason. And with this approach, it should be
> easier to set the variable for all actions on the resource
> (demote/stop/start/promote), rather than just the stop.
> 
> I think the boolean approach fits all the envisioned use cases that have
> been discussed. Any objections to going that route instead of the count?

I think that sounds fine to me.  Thanks!
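
To make that concrete, a hedged sketch of how an RA's stop action might consume
the proposed flag (the variable name follows the proposal above and is not an
existing Pacemaker interface; disable_service and stop_service are placeholder
helpers):

    stop() {
        if [ "${OCF_RESKEY_CRM_recovery:-false}" = "true" ]; then
            # Part of a stop/start cycle on this node: skip the expensive
            # external teardown so the restart stays cheap.
            stop_service
        else
            # Final stop as far as this transition is concerned: tell the
            # outside world (e.g. nova service-disable) before going down.
            disable_service
            stop_service
        fi
        return $OCF_SUCCESS
    }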

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: FR: send failcount to OCF RA start/stop actions

2016-05-20 Thread Adam Spiers
Ken Gaillot  wrote:
> On 05/12/2016 06:21 AM, Adam Spiers wrote:
> > Ken Gaillot  wrote:
> >> On 05/10/2016 02:29 AM, Ulrich Windl wrote:
>  Here is what I'm testing currently:
> 
>  - When the cluster recovers a resource, the resource agent's stop action
>  will get a new variable, OCF_RESKEY_CRM_meta_recovery_left =
>  migration-threshold - fail-count on the local node.

[snipped]

> > I'd prefer plural (OCF_RESKEY_CRM_meta_recoveries_left) but other than
> > that I think it's good.  OCF_RESKEY_CRM_meta_retries_left is shorter;
> > not sure whether it's marginally worse or better though.
> 
> I'm now leaning to restart_remaining (restarts_remaining would be just
> as good).

restarts_remaining would be better IMHO, given that it's expected that
often multiple restarts will be remaining.

[snipped]

> > OK, so the RA code would typically be something like this?
> > 
> > if [ ${OCF_RESKEY_CRM_meta_retries_left:-0} = 0 ]; then
> > # This is the final stop, so tell the external service
> > # not to send any more work our way.
> > disable_service
> > fi
> 
> I'd use -eq :) but yes

Right, -eq is better style for numeric comparison :-)

[snipped]

>  -- If a resource is being recovered, but the fail-count is being cleared
>  in the same transition, the cluster will ignore migration-threshold (and
>  the variable will not be set). The RA might see recovery_left=5, 4, 3,
>  then someone clears the fail-count, and it won't see recovery_left even
>  though there is a stop and start being attempted.
> > 
> > Hmm.  So how would the RA distinguish that case from the one where
> > the stop is final?
> 
> That's the main question in all this. There are quite a few scenarios
> where there's no meaningful distinction between 0 and unset. With the
> current implementation at least, the ideal approach is for the RA to
> treat the last stop before a restart the same as a final stop.

OK ...

[snipped]

> > So IIUC, you are talking about a scenario like this:
> > 
> > 1. The whole group starts fine.
> > 2. Some time later, the neutron openvswitch agent crashes.
> > 3. Pacemaker shuts down nova-compute since it depends upon
> >the neutron agent due to being later in the same group.
> > 4. Pacemaker repeatedly tries to start the neutron agent,
> >but reaches migration-threshold.
> > 
> > At this point, nova-compute is permanently down, but its RA never got
> > passed OCF_RESKEY_CRM_meta_retries_left with a value of 0 or unset,
> > so it never knew to do a nova service-disable.
> 
> Basically right, but it would be unset (not empty -- it's never empty).
> 
> However, this is a solvable issue. If it's important, I can add the
> variable to all siblings of the failed resource if the entire group
> would be forced away.

Good to hear.

> > (BTW, in this scenario, the group is actually cloned, so no migration
> > to another compute node happens.)
> 
> Clones are the perfect example of the lack of distinction between 0 and
> unset. For an anonymous clone running on all nodes, the countdown will
> be 3,2,1,unset because the specific clone instance doesn't need to be
> started anywhere else (it looks more like a final stop of that
> instance). But for unique clones, or anonymous clones where another node
> is available to run the instance, it might be 0.

I see, thanks.

> > Did I get that right?  If so, yes it does sound like an issue.  Maybe
> > it is possible to avoid this problem by avoiding the use of groups,
> > and instead just use interleaved clones with ordering constraints
> > between them?
> 
> That's not any better, and in fact it would be more difficult to add the
> variable to the dependent resource in such a situation, compared to a group.
> 
> Generally, only the failed resource will get the variable, not resources
> that may be stopped and started because they depend on the failed
> resource in some way.

OK.  So that might be more of a problem for you guys than for us, since we use
cloned groups, and you don't:

https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/high-availability-for-compute-instances/chapter-1-use-high-availability-to-protect-instances

> >> More generally, I suppose the point is to better support services that
> >> can do a lesser tear-down for a stop-start cycle than a full stop. The
> >> distinction between the two cases may not be 100% clear (as with your
> >> fencing example), but the idea is that it would be used for
> >> optimization, not some required behavior.
> > 
> > This discussion is prompting me to get this clearer in my head, which
> > is good :-)
> > 
> > I suppose we *could* simply modify the existing NovaCompute OCF RA so
> > that every time it executes the 'stop' action, it immediately sends
> > the service-disable message to nova-api, and similarly send
> > service-enable during the 'start' action.  However this probably has a
> > few downsides:
> > 
> > 1. It could cause rapid flapping o

[ClusterLabs] Resource seems to not obey constraint

2016-05-20 Thread Leon Botes

I push the following config.
The iscsi-target fails as it tries to start on iscsiA-node1
This is because I have no target installed on iscsiA-node1 which is by 
design. All services listed here should only start on iscsiA-san1 or
iscsiA-san2.
I am using iscsiA-node1 basically for quorum and some other
minor functions.


Can someone please show me where I am going wrong?
All services should start on the same node, order is drbd-master 
vip-blue vip-green iscsi-target iscsi-lun


pcs -f ha_config property set symmetric-cluster="true"
pcs -f ha_config property set no-quorum-policy="stop"
pcs -f ha_config property set stonith-enabled="false"
pcs -f ha_config resource defaults resource-stickiness="200"

pcs -f ha_config resource create drbd ocf:linbit:drbd drbd_resource=r0 
op monitor interval=60s
pcs -f ha_config resource master drbd master-max=1 master-node-max=1 
clone-max=2 clone-node-max=1 notify=true
pcs -f ha_config resource create vip-blue ocf:heartbeat:IPaddr2 
ip=192.168.101.100 cidr_netmask=32 nic=blue op monitor interval=20s
pcs -f ha_config resource create vip-green ocf:heartbeat:IPaddr2 
ip=192.168.102.100 cidr_netmask=32 nic=green op monitor interval=20s
pcs -f ha_config resource create iscsi-target ocf:heartbeat:iSCSITarget 
params iqn="iqn.2016-05.trusc.net" implementation="lio-t" op monitor 
interval="30s"
pcs -f ha_config resource create iscsi-lun 
ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2016-05.trusc.net" 
lun="1" path="/dev/drbd0"


pcs -f ha_config constraint colocation add vip-blue drbd-master INFINITY 
with-rsc-role=Master
pcs -f ha_config constraint colocation add vip-green drbd-master 
INFINITY with-rsc-role=Master


pcs -f ha_config constraint location drbd-master prefers stor-san1=500
pcs -f ha_config constraint location drbd-master avoids stor-node1=INFINITY

pcs -f ha_config constraint order promote drbd-master then start vip-blue
pcs -f ha_config constraint order start vip-blue then start vip-green
pcs -f ha_config constraint order start vip-green then start iscsi-target
pcs -f ha_config constraint order start iscsi-target then start iscsi-lun

Results:

[root@san1 ~]# pcs status
Cluster name: storage_cluster
Last updated: Fri May 20 17:21:10 2016  Last change: Fri May 20 
17:19:43 2016 by root via cibadmin on iscsiA-san1

Stack: corosync
Current DC: iscsiA-san1 (version 1.1.13-10.el7_2.2-44eb2dd) - partition 
with quorum

3 nodes and 6 resources configured

Online: [ iscsiA-node1 iscsiA-san1 iscsiA-san2 ]

Full list of resources:

 Master/Slave Set: drbd-master [drbd]
 Masters: [ iscsiA-san1 ]
 Slaves: [ iscsiA-san2 ]
 vip-blue   (ocf::heartbeat:IPaddr2):   Started iscsiA-san1
 vip-green  (ocf::heartbeat:IPaddr2):   Started iscsiA-san1
 iscsi-target   (ocf::heartbeat:iSCSITarget):   FAILED iscsiA-node1 
(unmanaged)

 iscsi-lun  (ocf::heartbeat:iSCSILogicalUnit):  Stopped

Failed Actions:
* drbd_monitor_0 on iscsiA-node1 'not installed' (5): call=6, status=Not 
installed, exitreason='none',

last-rc-change='Fri May 20 17:19:44 2016', queued=0ms, exec=0ms
* iscsi-target_stop_0 on iscsiA-node1 'not installed' (5): call=24, 
status=complete, exitreason='Setup problem: couldn't find command: 
targetcli',

last-rc-change='Fri May 20 17:19:45 2016', queued=0ms, exec=18ms
* iscsi-lun_monitor_0 on iscsiA-node1 'not installed' (5): call=22, 
status=complete, exitreason='Undefined iSCSI target implementation',

last-rc-change='Fri May 20 17:19:44 2016', queued=0ms, exec=27ms


PCSD Status:
  iscsiA-san1: Online
  iscsiA-san2: Online
  iscsiA-node1: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled

--
Regards

Leon

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Issue in resource constraints and fencing - RHEL7 - AWS EC2

2016-05-20 Thread Pratip Ghosh

Hi All,

I am implementing a 2-node Red Hat (RHEL 7.2) HA cluster on Amazon EC2
instances. For the floating IP I am using a shell script provided by AWS so
that the virtual IP floats to another instance if any server fails its
health check. At a basic level the cluster is working, but I have two issues,
which I describe below.


ISSUE 1
=
Now I need to configure fencing/STONITH to avoid split-brain scenarios in
the storage cluster. I want to use multi-primary (Active/Active) DRBD in my
cluster for distributed storage. Is it possible to configure power
fencing on AWS EC2 instances? Can anyone please guide me on this?



ISSUE2
=
Currently I am using single-primary DRBD distributed storage. I added
cluster resources so that if any cluster node goes down, then the other
cluster node will promote the DRBD volume to primary and mount it on
/var/www/html.


This configuration works, but only if cluster node1 goes down.
If cluster node2 goes down, all cluster resources fail over to
cluster node1, but whenever cluster node2 comes back online,
ownership of virtual_ip (the cluster IP) automatically moves back to
cluster node2. The remaining resources do not fail back like that. In
that case the secondary IP stays with node1 while ownership goes to node2.


I think this is an issue with resource stickiness or a resource constraint,
but here I am totally clueless. Can anyone please help me on this?



My cluster details:
===

[root@drbd01 ~]# pcs config
Cluster Name: web_cluster
Corosync Nodes:
 ec2-52-24-8-124.us-west-2.compute.amazonaws.com 
ec2-52-27-70-12.us-west-2.compute.amazonaws.com

Pacemaker Nodes:
 ec2-52-24-8-124.us-west-2.compute.amazonaws.com 
ec2-52-27-70-12.us-west-2.compute.amazonaws.com


Resources:
 Resource: virtual_ip (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.98.70.100 cidr_netmask=24
  Operations: start interval=0s timeout=20s (virtual_ip-start-interval-0s)
  stop interval=0s timeout=20s (virtual_ip-stop-interval-0s)
  monitor interval=30s (virtual_ip-monitor-interval-30s)
 Resource: WebSite (class=ocf provider=heartbeat type=apache)
  Attributes: configfile=/etc/httpd/conf/httpd.conf 
statusurl=http://10.98.70.100/server-status

  Operations: start interval=0s timeout=40s (WebSite-start-interval-0s)
  stop interval=0s timeout=60s (WebSite-stop-interval-0s)
  monitor interval=1min (WebSite-monitor-interval-1min)
 Master: WebDataClone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 
clone-node-max=1 notify=true

  Resource: WebData (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=r1
   Operations: start interval=0s timeout=240 (WebData-start-interval-0s)
   promote interval=0s timeout=90 
(WebData-promote-interval-0s)

   demote interval=0s timeout=90 (WebData-demote-interval-0s)
   stop interval=0s timeout=100 (WebData-stop-interval-0s)
   monitor interval=60s (WebData-monitor-interval-60s)
 Resource: WebFS (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs
  Operations: start interval=0s timeout=60 (WebFS-start-interval-0s)
  stop interval=0s timeout=60 (WebFS-stop-interval-0s)
  monitor interval=20 timeout=40 (WebFS-monitor-interval-20)

Stonith Devices:
Fencing Levels:

Location Constraints:
Ordering Constraints:
  promote WebDataClone then start WebFS (kind:Mandatory) 
(id:order-WebDataClone-WebFS-mandatory)
  start WebFS then start virtual_ip (kind:Mandatory) 
(id:order-WebFS-virtual_ip-mandatory)
  start virtual_ip then start WebSite (kind:Mandatory) 
(id:order-virtual_ip-WebSite-mandatory)

Colocation Constraints:
  WebSite with virtual_ip (score:INFINITY) 
(id:colocation-WebSite-virtual_ip-INFINITY)
  WebFS with WebDataClone (score:INFINITY) (with-rsc-role:Master) 
(id:colocation-WebFS-WebDataClone-INFINITY)
  WebSite with WebFS (score:INFINITY) 
(id:colocation-WebSite-WebFS-INFINITY)


Resources Defaults:
 resource-stickiness: INFINITY
Operations Defaults:
 timeout: 240s

Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: web_cluster
 dc-version: 1.1.13-10.el7-44eb2dd
 default-resource-stickiness: INFINITY
 have-watchdog: false
 no-quorum-policy: ignore
 stonith-action: poweroff
 stonith-enabled: false



Regards,
Pratip Ghosh.

--
Thanks,

Pratip.
+91-9007515795

NOTICE: This e-mail and any attachment may contain confidential information 
that may be legally privileged. If you are not the intended recipient, you must 
not review, retransmit,
print, copy, use or disseminate it. Please immediately notify us by return 
e-mail and delete it.


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

Re: [ClusterLabs] crm_attribute bug in 1.1.15-rc1

2016-05-20 Thread Jehan-Guillaume de Rorthais
Le Fri, 20 May 2016 15:31:16 +0300,
Andrey Rogovsky  a écrit :

> Hi!
> I cant get attribute value:
> /usr/sbin/crm_attribute -q --type nodes --node-uname $HOSTNAME --attr-name
> master-pgsqld --get-value
> Error performing operation: No such device or address
> 
> This value is present:
> crm_mon -A1  | grep master-pgsqld
> + master-pgsqld: 1001
> + master-pgsqld: 1000
> + master-pgsqld: 1

Use crm_master to get master scores easily.
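
(For what it's worth, the query above probably fails because master scores are
transient node attributes stored in the status section, not permanent ones in
the nodes section; crm_master is a thin wrapper that queries the right place
for you. A hedged equivalent, reusing the options from the original command but
with an explicit lifetime:)

    /usr/sbin/crm_attribute -q --node-uname $HOSTNAME --attr-name master-pgsqld \
        --lifetime reboot --get-value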

> I use 1.1.15-rc1
> dpkg -l | grep pacemaker-cli-utils
> ii  pacemaker-cli-utils1.1.15-rc1amd64
>Command line interface utilities for Pacemaker
> 
> Also non-integer values work file:
> /usr/sbin/crm_attribute -q --type nodes --node-uname $HOSTNAME --attr-name
> pgsql-data-status --get-value
> STREAMING|ASYNC

I'm very confused. It sounds like you are mixing two different resource agents
for PostgreSQL. I can recognize the scores for your master resource set by the
pgsqlms RA (PAF project) and the data-status attribute from the pgsql RA...

> I thinking this patch
> https://github.com/ClusterLabs/pacemaker/commit/26d34a9171bddae67c56ebd8c2513ea8fa770204?diff=unified#diff-55bc49a57c12093902e3842ce349a71fR269
> is
> not apply in 1.1.15-rc1?
> 
> How I can get integere value from node attribute?

With the correct name for the given attribute.

Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker not invoking monitor after $interval

2016-05-20 Thread Felix Zachlod (Lists)
> -----Original Message-----
> From: Jehan-Guillaume de Rorthais [mailto:j...@dalibo.com]
> Sent: Friday, 20 May 2016 13:52
> To: Felix Zachlod (Lists) 
> Cc: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Pacemaker not invoking monitor after
> $interval
> 
> Le Fri, 20 May 2016 11:33:39 +,
> "Felix Zachlod (Lists)"  a écrit :
> 
> > Hello!
> >
> > I am currently working on a cluster setup which includes several 
> > resources with "monitor interval=XXs" set. As far as I understand 
> > this should run the monitor action on the resource agent every XX 
> > seconds. But it seems it doesn't.
> 
> How do you know it doesn't? Are you looking at crm_mon? log files?

I created debug output from my RA. Furthermore I had a blackbox dump.
But it now turned out that for my resource I had to change the meta-data to
advertise the monitor action twice (one for the Slave role, one for the Master
role) and to configure

op monitor role=x interval=y instead of op monitor interval=x

Since I changed that, monitoring is working as desired for this resource, at
least for now. I am not sure why a Master/Slave resource has to have distinct
monitor actions advertised for both roles, but it seems related to that.

I still don't see any monitor invocations in the log, but it seems there is
still something wrong with the log level.

Thanks anyway!

regards, Felix
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] crm_attribute bug in 1.1.15-rc1

2016-05-20 Thread Andrey Rogovsky
Hi!
I can't get the attribute value:
/usr/sbin/crm_attribute -q --type nodes --node-uname $HOSTNAME --attr-name
master-pgsqld --get-value
Error performing operation: No such device or address

This value is present:
crm_mon -A1  | grep master-pgsqld
+ master-pgsqld: 1001
+ master-pgsqld: 1000
+ master-pgsqld: 1

I use 1.1.15-rc1
dpkg -l | grep pacemaker-cli-utils
ii  pacemaker-cli-utils1.1.15-rc1amd64
   Command line interface utilities for Pacemaker

Also, non-integer values work fine:
/usr/sbin/crm_attribute -q --type nodes --node-uname $HOSTNAME --attr-name
pgsql-data-status --get-value
STREAMING|ASYNC

I think this patch
https://github.com/ClusterLabs/pacemaker/commit/26d34a9171bddae67c56ebd8c2513ea8fa770204?diff=unified#diff-55bc49a57c12093902e3842ce349a71fR269
is not applied in 1.1.15-rc1?

How can I get an integer value from a node attribute?
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Pacemaker not invoking monitor after $interval

2016-05-20 Thread Ulrich Windl
>>> "Felix Zachlod (Lists)"  schrieb am 20.05.2016 um
13:33 in Nachricht
<670f732376b88843b8df7ad917cf8dd9289c0...@bulla.intern.onesty-tech.loc>:
> Hello!
> 
> I am currently working on a cluster setup which includes several resources
> with "monitor interval=XXs" set. As far as I understand this should run the
> monitor action on the resource agent every XX seconds. But it seems it
> doesn't. Actually monitor is only invoked in special condition, e.g. cleanup,
> start and so on, but never for a running (or stopped) resource. So it won't
> detect any resource failures, unless a manual action takes place. It won't
> update master preference either when set in the monitor action.
> 
> Are there any special conditions under which the monitor will not be 
> executed? (Cluster IS managed though)
> 
> property cib-bootstrap-options: \
> have-watchdog=false \
> dc-version=1.1.13-10.el7_2.2-44eb2dd \
> cluster-infrastructure=corosync \
> cluster-name=sancluster \
> maintenance-mode=false \
> symmetric-cluster=false \
> last-lrm-refresh=1463739404 \
> stonith-enabled=true \
> stonith-action=reboot
> 
> Thank you in advance, regards, Felix

Try "crm_mon -1Arfj" (or similar) and look into your logs "grep monitor ...".

> 
> --
> Kind regards
> Dipl. Inf. (FH) Felix Zachlod
> 
> Onesty Tech GmbH
> Lieberoser Str. 7
> 03046 Cottbus
> 
> Tel.: +49 (355) 289430
> Fax.: +49 (355) 28943100
> f...@onesty-tech.de 
> 
> Commercial register: Amtsgericht Cottbus, HRB 7885. Managing directors: Romy Schötz,
> Thomas Menzel
> 
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users 
> 
> Project Home: http://www.clusterlabs.org 
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf 
> Bugs: http://bugs.clusterlabs.org 




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker not invoking monitor after $interval

2016-05-20 Thread Jehan-Guillaume de Rorthais
Le Fri, 20 May 2016 11:33:39 +,
"Felix Zachlod (Lists)"  a écrit :

> Hello!
> 
> I am currently working on a cluster setup which includes several resources
> with "monitor interval=XXs" set. As far as I understand this should run the
> monitor action on the resource agent every XX seconds. But it seems it
> doesn't. 

How do you know it doesn't? Are you looking at crm_mon? log files?

If you are looking at crm_mon, the output will not be updated unless some
changes are applied to the CIB or a transition is in progress.

> Actually monitor is only invoked in special condition, e.g. cleanup,
> start and so on, but never for a running (or stopped) resource. So it won't
> detect any resource failures, unless a manual action takes place. It won't
> update master preference either when set in the monitor action.
> 
> Are there any special conditions under which the monitor will not be
> executed?

Could you provide us with your Pacemaker setup?

> (Cluster IS managed though)

Resources can be unmanaged individually as well.

Regards,
-- 
Jehan-Guillaume de Rorthais
Dalibo

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker not invoking monitor after $interval

2016-05-20 Thread Felix Zachlod (Lists)
Hello!

I am currently working on a cluster setup which includes several resources with 
"monitor interval=XXs" set. As far as I understand this should run the monitor 
action on the resource agent every XX seconds. But it seems it doesn't. 
Actually monitor is only invoked in special conditions, e.g. cleanup, start and 
so on, but never for a running (or stopped) resource. So it won't detect any 
resource failures unless a manual action takes place. It won't update the master 
preference either, when set in the monitor action.

Are there any special conditions under which the monitor will not be executed? 
(Cluster IS managed though)

property cib-bootstrap-options: \
have-watchdog=false \
dc-version=1.1.13-10.el7_2.2-44eb2dd \
cluster-infrastructure=corosync \
cluster-name=sancluster \
maintenance-mode=false \
symmetric-cluster=false \
last-lrm-refresh=1463739404 \
stonith-enabled=true \
stonith-action=reboot

Thank you in advance, regards, Felix

--
Kind regards
Dipl. Inf. (FH) Felix Zachlod

Onesty Tech GmbH
Lieberoser Str. 7
03046 Cottbus

Tel.: +49 (355) 289430
Fax.: +49 (355) 28943100
f...@onesty-tech.de

Commercial register: Amtsgericht Cottbus, HRB 7885. Managing directors: Romy Schötz,
Thomas Menzel



___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker reload Master/Slave resource

2016-05-20 Thread Felix Zachlod (Lists)
version 1.1.13-10.el7_2.2-44eb2dd

Hello!

I am currently developing a master/slave resource agent. So far it is working 
just fine, but this resource agent implements reload() and this does not work 
as expected when running as Master:
The reload action is invoked and it succeeds returning 0. The resource is still 
Master and monitor will return $OCF_RUNNING_MASTER.

But Pacemaker considers the instance to be a slave afterwards. Actually only 
reload is invoked; no monitor, no demote, etc.

I first thought that reload should possibly return $OCF_RUNNING_MASTER too, but 
this leads to the resource failing on reload. It seems 0 is the only valid 
return code.

I can recover the cluster state by running "resource $resourcename promote", 
which will call

notify
promote
notify

Afterwards my resource is considered Master again. After the PEngine Recheck Timer 
(I_PE_CALC) just popped (90ms), the cluster manager will promote the 
resource itself.
But this can lead to unexpected results: it could promote the resource on the 
wrong node so that both sides are actually running as master, and the cluster 
will not even notice, since it does not call monitor either.

Is this a bug?

regards, Felix


trace   May 20 12:58:31 cib_create_op(609):0: Sending call options: 0010, 
1048576
trace   May 20 12:58:31 cib_native_perform_op_delegate(384):0: Sending 
cib_modify message to CIB service (timeout=120s)
trace   May 20 12:58:31 crm_ipc_send(1175):0: Sending from client: cib_shm 
request id: 745 bytes: 1070 timeout:12 msg...
trace   May 20 12:58:31 crm_ipc_send(1188):0: Message sent, not waiting for 
reply to 745 from cib_shm to 1070 bytes...
trace   May 20 12:58:31 cib_native_perform_op_delegate(395):0: Reply: No data 
to dump as XML
trace   May 20 12:58:31 cib_native_perform_op_delegate(398):0: Async call, 
returning 268
trace   May 20 12:58:31 do_update_resource(2274):0: Sent resource state update 
message: 268 for reload=0 on scst_dg_ssd
trace   May 20 12:58:31 cib_client_register_callback_full(606):0: Adding 
callback cib_rsc_callback for call 268
trace   May 20 12:58:31 process_lrm_event(2374):0: Op scst_dg_ssd_reload_0 
(call=449, stop-id=scst_dg_ssd:449, remaining=3): Confirmed
notice  May 20 12:58:31 process_lrm_event(2392):0: Operation 
scst_dg_ssd_reload_0: ok (node=alpha, call=449, rc=0, cib-update=268, 
confirmed=true)
debug   May 20 12:58:31 update_history_cache(196):0: Updating history for 
'scst_dg_ssd' with reload op
trace   May 20 12:58:31 crm_ipc_read(992):0: No message from lrmd received: 
Resource temporarily unavailable
trace   May 20 12:58:31 mainloop_gio_callback(654):0: Message acquisition from 
lrmd[0x22b0ec0] failed: No message of desired type (-42)
trace   May 20 12:58:31 crm_fsa_trigger(293):0: Invoked (queue len: 0)
trace   May 20 12:58:31 s_crmd_fsa(159):0: FSA invoked with Cause: 
C_FSA_INTERNAL   State: S_NOT_DC
trace   May 20 12:58:31 s_crmd_fsa(246):0: Exiting the FSA
trace   May 20 12:58:31 crm_fsa_trigger(295):0: Exited  (queue len: 0)
trace   May 20 12:58:31 crm_ipc_read(989):0: Received cib_shm event 2108, 
size=183, rc=183, text: http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Jehan-Guillaume de Rorthais
Le Fri, 20 May 2016 11:12:28 +0200,
"Ulrich Windl"  a écrit :

> >>> Jehan-Guillaume de Rorthais  schrieb am 20.05.2016 um
> 09:59 in
> Nachricht <20160520095934.029c1822@firost>:
> > Le Fri, 20 May 2016 08:39:42 +0200,
> > "Ulrich Windl"  a écrit :
> > 
> >> >>> Jehan-Guillaume de Rorthais  schrieb am 19.05.2016 um
> >> >>> 21:29 in
> >> Nachricht <20160519212947.6cc0fd7b@firost>:
> >> [...]
> >> > I was thinking of a use case where a graceful demote or stop action failed
> >> > multiple times and to give a chance to the RA to choose another method to
> >> > stop the resource before it requires a migration. As instance, PostgreSQL
> >> > has 3 different kind of stop, the last one being not graceful, but still
> >> > better than a kill -9.
> >> 
> >> For example the Xen RA tries a clean shutdown with a timeout of about 2/3 of
> >> the timeout; if it fails it shuts the VM down the hard way.
> > 
> > Reading the Xen RA, I see they added a shutdown timeout escalation 
> > parameter.
> 
> Not quite:
> if [ -n "$OCF_RESKEY_shutdown_timeout" ]; then
>   timeout=$OCF_RESKEY_shutdown_timeout
> elif [ -n "$OCF_RESKEY_CRM_meta_timeout" ]; then
>   # Allow 2/3 of the action timeout for the orderly shutdown
>   # (The origin unit is ms, hence the conversion)
>   timeout=$((OCF_RESKEY_CRM_meta_timeout/1500))
> else
>   timeout=60
> fi
> 
> > This is a reasonable solution, but isn't it possible to get the action 
> > timeout
> > directly? I looked for such information in the past with no success.
> 
> See above.

Gosh, this is embarrassing...how could we miss that?

Thank you for pointing this!

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Ulrich Windl
>>> Jehan-Guillaume de Rorthais  schrieb am 20.05.2016 um
09:59 in
Nachricht <20160520095934.029c1822@firost>:
> Le Fri, 20 May 2016 08:39:42 +0200,
> "Ulrich Windl"  a écrit :
> 
>> >>> Jehan-Guillaume de Rorthais  schrieb am 19.05.2016 um
>> >>> 21:29 in
>> Nachricht <20160519212947.6cc0fd7b@firost>:
>> [...]
>> > I was thinking of a use case where a graceful demote or stop action failed
>> > multiple times and to give a chance to the RA to choose another method to
>> > stop the resource before it requires a migration. As instance, PostgreSQL
>> > has 3 different kind of stop, the last one being not graceful, but still
>> > better than a kill -9.
>> 
>> For example the Xen RA tries a clean shutdown with a timeout of about 2/3 of
>> the timeout; if it fails it shuts the VM down the hard way.
> 
> Reading the Xen RA, I see they added a shutdown timeout escalation 
> parameter.

Not quite:
if [ -n "$OCF_RESKEY_shutdown_timeout" ]; then
  timeout=$OCF_RESKEY_shutdown_timeout
elif [ -n "$OCF_RESKEY_CRM_meta_timeout" ]; then
  # Allow 2/3 of the action timeout for the orderly shutdown
  # (The origin unit is ms, hence the conversion)
  timeout=$((OCF_RESKEY_CRM_meta_timeout/1500))
else
  timeout=60
fi

> This is a reasonable solution, but isn't it possible to get the action 
> timeout
> directly? I looked for such information in the past with no success.

See above.

> 
>> 
>> I don't know Postgres in detail, but I could imagine a three step
approach:
>> 1) Shutdown after current operations have finished
>> 2) Shutdown regardless of pending operations (doing rollbacks)
>> 3) Shutdown the hard way, requiring recovery on the next start (I think in
>> Oracle this is called a "shutdown abort")
> 
> Exactly.
> 
>> Depending on the scenario one may start at step 2)
> 
> Indeed.
>  
>> [...]
>> I think RAs should not rely on "stop" being called multiple times for a
>> resource to be stopped.
> 
> Ok, so the RA should take care of their own escalation during a single 
> action.
> 
> Thanks, 

Regards,
Ulrich



___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Klaus Wenninger
On 05/20/2016 08:39 AM, Ulrich Windl wrote:
 Jehan-Guillaume de Rorthais  schrieb am 19.05.2016 um 
 21:29 in
> Nachricht <20160519212947.6cc0fd7b@firost>:
> [...]
>> I was thinking of a use case where a graceful demote or stop action failed
>> multiple times and to give a chance to the RA to choose another method to 
>> stop
>> the resource before it requires a migration. As instance, PostgreSQL has 3
>> different kind of stop, the last one being not graceful, but still better 
>> than
>> a kill -9.
> For example the Xen RA tries a clean shutdown with a timeout of about 2/3 of 
> the timeout; if it fails it shuts the VM down the hard way.
>
> I don't know Postgres in detail, but I could imagine a three step approach:
> 1) Shutdown after current operations have finished
> 2) Shutdown regardless of pending operations (doing rollbacks)
> 3) Shutdown the hard way, requiring recovery on the next start (I think in 
> Oracle this is called a "shutdown abort")
>
> Depending on the scenario one may start at step 2)
>
> [...]
> I think RAs should not rely on "stop" being called multiple times for a 
> resource to be stopped.

I see a couple of positive points in having something inside pacemaker
that helps the RAs escalate their stop strategy:

- this way you have the same logging for all RAs - done within the RA it
  would look different for each of them
- timeout-retry logic is potentially prone to not being implemented
  properly - like this you have a proven implementation within pacemaker
- it keeps the logic within the RA simpler and guides implementations in a
  certain direction that makes them look more similar to each other, making
  it easier to understand an RA you haven't seen before

Of course there are basically two approaches to achieve this:

- give the RA some global or per-resource view of pacemaker and leave it to
  the RA to act in a responsible manner (like telling the RA that there are
  x stop-retries to come)
- handle the escalation within pacemaker and tell the RA explicitly what you
  expect it to do, like requesting a graceful / hard / emergency (or however
  you would call it) stop
 
>
> Regards,
> Ulrich
>
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?

2016-05-20 Thread Jehan-Guillaume de Rorthais
Le Fri, 20 May 2016 08:39:42 +0200,
"Ulrich Windl"  a écrit :

> >>> Jehan-Guillaume de Rorthais  schrieb am 19.05.2016 um
> >>> 21:29 in
> Nachricht <20160519212947.6cc0fd7b@firost>:
> [...]
> > I was thinking of a use case where a graceful demote or stop action failed
> > multiple times and to give a chance to the RA to choose another method to 
> > stop
> > the resource before it requires a migration. As instance, PostgreSQL has 3
> > different kind of stop, the last one being not graceful, but still better 
> > than
> > a kill -9.
> 
> For example the Xen RA tries a clean shutdown with a timeout of about 2/3 of
> the timeout; if it fails it shuts the VM down the hard way.

Reading the Xen RA, I see they added a shutdown timeout escalation parameter.
This is a reasonable solution, but isn't it possible to get the action timeout
directly? I looked for such information in the past with no success.

> 
> I don't know Postgres in detail, but I could imagine a three step approach:
> 1) Shutdown after current operations have finished
> 2) Shutdown regardless of pending operations (doing rollbacks)
> 3) Shutdown the hard way, requiring recovery on the next start (I think in
> Oracle this is called a "shutdown abort")

Exactly.

> Depending on the scenario one may start at step 2)

Indeed.
 
> [...]
> I think RAs should not rely on "stop" being called multiple times for a
> resource to be stopped.

Ok, so the RA should take care of its own escalation during a single action.
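
For the PostgreSQL example, a hedged sketch of what that in-RA escalation could
look like (the function name and timeouts are made up; pg_ctl's fast and
immediate modes are real):

    pg_stop_escalate() {
        # fast: rolls back active transactions and disconnects clients
        pg_ctl -D "$PGDATA" stop -m fast -t 30 -w && return $OCF_SUCCESS
        # immediate: aborts the server; crash recovery will run on next start
        pg_ctl -D "$PGDATA" stop -m immediate -t 30 -w && return $OCF_SUCCESS
        return $OCF_ERR_GENERIC
    }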

Thanks, 

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org