[ClusterLabs] cluster stops randomly
Hi,

I have a cluster and it generally works well, but sometimes the cluster stops on all nodes and I have to start it manually. The pcsd service is running, but the cluster is stopped. I checked the Pacemaker log but couldn't find any warning or error. What could the issue be? (stonith is disabled.)

Regards,
H.Yavari

___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Node attributes
Thank you. I used this and it works.

Regards.

From: Ken Gaillot To: users@clusterlabs.org Sent: Thursday, 19 May 2016, 19:34:08 Subject: Re: [ClusterLabs] Node attributes

On 05/18/2016 10:49 PM, H Yavari wrote:
> Hi,
>
> How can I define a constraint for two resources based on a node attribute?
>
> For example, resources X and Y are co-located based on node attribute Z.
>
> Regards,
> H.Yavari

Hi,

See http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#idm140617356537136

High-level tools such as pcs and crm provide a simpler interface, but the concepts will be the same. Rules apply to location constraints, not colocation constraints, but you can still accomplish what you want. If your goal is that X and Y each can only run on a node with attribute Z, set up a location constraint for each one using the appropriate rule. If your goal is that X and Y must also be colocated together, on a node with attribute Z, set up a regular colocation constraint between them, and a location constraint for one of them with the appropriate rule; or, put them in a group, and set up a location constraint for the group with the appropriate rule.

___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
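A minimal pcs sketch of the rule-based approach described above; X, Y, and Z are the placeholder names from the question, and the node names are invented:

    # Mark the nodes that carry the attribute (permanent node attribute "Z"):
    crm_attribute --type nodes --node node1 --name Z --update 1
    crm_attribute --type nodes --node node2 --name Z --update 1

    # Keep X off any node where Z is missing or not 1; repeat for Y,
    # or put X and Y in a group and attach the rule to the group:
    pcs constraint location X rule score=-INFINITY not_defined Z or Z ne 1

    # If X and Y must also run together, add an ordinary colocation:
    pcs constraint colocation add Y with X INFINITY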
Re: [ClusterLabs] Resource seems to not obey constraint
On 05/20/2016 10:29 AM, Leon Botes wrote: > I push the following config. > The iscsi-target fails as it tries to start on iscsiA-node1 > This is because I have no target installed on iscsiA-node1 which is by > design. All services listed here should only start on iscsiA-san1 > iscsiA-san2. > I am using using the iscsiA-node1 basically for quorum and some other > minor functions. > > Can someone please show me where I am going wrong? > All services should start on the same node, order is drbd-master > vip-blue vip-green iscsi-target iscsi-lun > > pcs -f ha_config property set symmetric-cluster="true" > pcs -f ha_config property set no-quorum-policy="stop" > pcs -f ha_config property set stonith-enabled="false" > pcs -f ha_config resource defaults resource-stickiness="200" > > pcs -f ha_config resource create drbd ocf:linbit:drbd drbd_resource=r0 > op monitor interval=60s > pcs -f ha_config resource master drbd master-max=1 master-node-max=1 > clone-max=2 clone-node-max=1 notify=true > pcs -f ha_config resource create vip-blue ocf:heartbeat:IPaddr2 > ip=192.168.101.100 cidr_netmask=32 nic=blue op monitor interval=20s > pcs -f ha_config resource create vip-green ocf:heartbeat:IPaddr2 > ip=192.168.102.100 cidr_netmask=32 nic=green op monitor interval=20s > pcs -f ha_config resource create iscsi-target ocf:heartbeat:iSCSITarget > params iqn="iqn.2016-05.trusc.net" implementation="lio-t" op monitor > interval="30s" > pcs -f ha_config resource create iscsi-lun > ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2016-05.trusc.net" > lun="1" path="/dev/drbd0" > > pcs -f ha_config constraint colocation add vip-blue drbd-master INFINITY > with-rsc-role=Master > pcs -f ha_config constraint colocation add vip-green drbd-master > INFINITY with-rsc-role=Master > > pcs -f ha_config constraint location drbd-master prefers stor-san1=500 > pcs -f ha_config constraint location drbd-master avoids stor-node1=INFINITY The above constraint is an example of how to ban a resource from a node. However stor-node1 is not a valid node name in your setup (maybe an earlier design?), so this particular constraint won't have any effect. If you want to ban certain resources from iscsiA-node1, add constraints like the above for each resource, using the correct node name. 
> pcs -f ha_config constraint order promote drbd-master then start vip-blue > pcs -f ha_config constraint order start vip-blue then start vip-green > pcs -f ha_config constraint order start vip-green then start iscsi-target > pcs -f ha_config constraint order start iscsi-target then start iscsi-lun > > Results: > > [root@san1 ~]# pcs status > Cluster name: storage_cluster > Last updated: Fri May 20 17:21:10 2016 Last change: Fri May 20 > 17:19:43 2016 by root via cibadmin on iscsiA-san1 > Stack: corosync > Current DC: iscsiA-san1 (version 1.1.13-10.el7_2.2-44eb2dd) - partition > with quorum > 3 nodes and 6 resources configured > > Online: [ iscsiA-node1 iscsiA-san1 iscsiA-san2 ] > > Full list of resources: > > Master/Slave Set: drbd-master [drbd] > Masters: [ iscsiA-san1 ] > Slaves: [ iscsiA-san2 ] > vip-blue (ocf::heartbeat:IPaddr2): Started iscsiA-san1 > vip-green (ocf::heartbeat:IPaddr2): Started iscsiA-san1 > iscsi-target (ocf::heartbeat:iSCSITarget): FAILED iscsiA-node1 > (unmanaged) > iscsi-lun (ocf::heartbeat:iSCSILogicalUnit): Stopped > > Failed Actions: > * drbd_monitor_0 on iscsiA-node1 'not installed' (5): call=6, status=Not > installed, exitreason='none', > last-rc-change='Fri May 20 17:19:44 2016', queued=0ms, exec=0ms > * iscsi-target_stop_0 on iscsiA-node1 'not installed' (5): call=24, > status=complete, exitreason='Setup problem: couldn't find command: > targetcli', > last-rc-change='Fri May 20 17:19:45 2016', queued=0ms, exec=18ms > * iscsi-lun_monitor_0 on iscsiA-node1 'not installed' (5): call=22, > status=complete, exitreason='Undefined iSCSI target implementation', > last-rc-change='Fri May 20 17:19:44 2016', queued=0ms, exec=27ms The above failures will still occur even if you add the proper constraints, because these are probes. Before starting a resource, Pacemaker probes it on all nodes, to make sure it's not already running somewhere. You can prevent this when you know it is impossible that the resource could be running on a particular node, by adding resource-discovery=never when creating the constraint banning it from that node. > > PCSD Status: > iscsiA-san1: Online > iscsiA-san2: Online > iscsiA-node1: Online > > Daemon Status: > corosync: active/disabled > pacemaker: active/disabled > pcsd: active/disabled > ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
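A sketch of the bans described above, using the real node name; the constraint IDs are invented, and you should verify that your pcs version accepts the resource-discovery option (the underlying feature exists in the Pacemaker 1.1.13 shown in the status output):

    pcs constraint location add ban-drbd-node1   drbd-master  iscsiA-node1 -INFINITY resource-discovery=never
    pcs constraint location add ban-target-node1 iscsi-target iscsiA-node1 -INFINITY resource-discovery=never
    pcs constraint location add ban-lun-node1    iscsi-lun    iscsiA-node1 -INFINITY resource-discovery=never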
Re: [ClusterLabs] Issue in resource constraints and fencing - RHEL7 - AWS EC2
On 05/20/2016 10:02 AM, Pratip Ghosh wrote: > Hi All, > > I am implementing 2 node RedHat (RHEL 7.2) HA cluster on Amazon EC2 > instance. For floating IP I am using a shell script provided by AWS so > that virtual IP float to another instance if any one server failed with > health check. In basic level cluster is working but I have 2 issues on > that which I describe in bellow. > > ISSUE 1 > = > Now I need to configure fencing/STONITH to avoid split brain scenario in > storage cluster. I want to use multi-primari (Active/Active) DRBD in my > cluster for distributed storage. Is it possible to configure power > fencing on AWS EC2 instance? Can any one please guide me on this? There has been some discussion about this on this list before -- see http://search.gmane.org/?query=ec2&group=gmane.comp.clustering.clusterlabs.user Basically, there is an outdated agent available at https://github.com/beekhof/fence_ec2 and a newer fork of it in the (RHEL-incompatible) cluster-glue package. So with some work you may be able to get something working. > > ISSUE2 > = > Currently I am using single primary DRBD distributed storage. I added > cluster resources so that if any cluster node goes down then another > cluster node will promoted DRBD volume as primary and mount it on > /var/www/html. > > This configuration is working but for only if cluster node1 goes down. > If cluster node2 goes down all cluster resources fails over towards > cluster node1 but whenever cluster node2 again become on-line then > virtual_ip (cluster ip) ownership automatically goes towards cluster > node2 again. All the remaining resources not failed over like that. In > that case secondary IP stays with Node1 and ownership goes to Node2. > > I think this is an issue with resource stickiness or resource constraint > but here I am totally clueless. Can any one please help me on this? 
> > > My cluster details: > === > > [root@drbd01 ~]# pcs config > Cluster Name: web_cluster > Corosync Nodes: > ec2-52-24-8-124.us-west-2.compute.amazonaws.com > ec2-52-27-70-12.us-west-2.compute.amazonaws.com > Pacemaker Nodes: > ec2-52-24-8-124.us-west-2.compute.amazonaws.com > ec2-52-27-70-12.us-west-2.compute.amazonaws.com > > Resources: > Resource: virtual_ip (class=ocf provider=heartbeat type=IPaddr2) > Attributes: ip=10.98.70.100 cidr_netmask=24 > Operations: start interval=0s timeout=20s (virtual_ip-start-interval-0s) > stop interval=0s timeout=20s (virtual_ip-stop-interval-0s) > monitor interval=30s (virtual_ip-monitor-interval-30s) > Resource: WebSite (class=ocf provider=heartbeat type=apache) > Attributes: configfile=/etc/httpd/conf/httpd.conf > statusurl=http://10.98.70.100/server-status > Operations: start interval=0s timeout=40s (WebSite-start-interval-0s) > stop interval=0s timeout=60s (WebSite-stop-interval-0s) > monitor interval=1min (WebSite-monitor-interval-1min) > Master: WebDataClone > Meta Attrs: master-max=1 master-node-max=1 clone-max=2 > clone-node-max=1 notify=true > Resource: WebData (class=ocf provider=linbit type=drbd) >Attributes: drbd_resource=r1 >Operations: start interval=0s timeout=240 (WebData-start-interval-0s) >promote interval=0s timeout=90 (WebData-promote-interval-0s) >demote interval=0s timeout=90 (WebData-demote-interval-0s) >stop interval=0s timeout=100 (WebData-stop-interval-0s) >monitor interval=60s (WebData-monitor-interval-60s) > Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) > Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs > Operations: start interval=0s timeout=60 (WebFS-start-interval-0s) > stop interval=0s timeout=60 (WebFS-stop-interval-0s) > monitor interval=20 timeout=40 (WebFS-monitor-interval-20) > > Stonith Devices: > Fencing Levels: > > Location Constraints: > Ordering Constraints: > promote WebDataClone then start WebFS (kind:Mandatory) > (id:order-WebDataClone-WebFS-mandatory) > start WebFS then start virtual_ip (kind:Mandatory) > (id:order-WebFS-virtual_ip-mandatory) > start virtual_ip then start WebSite (kind:Mandatory) > (id:order-virtual_ip-WebSite-mandatory) > Colocation Constraints: > WebSite with virtual_ip (score:INFINITY) > (id:colocation-WebSite-virtual_ip-INFINITY) > WebFS with WebDataClone (score:INFINITY) (with-rsc-role:Master) > (id:colocation-WebFS-WebDataClone-INFINITY) > WebSite with WebFS (score:INFINITY) > (id:colocation-WebSite-WebFS-INFINITY) > > Resources Defaults: > resource-stickiness: INFINITY You don't have any constraints requiring virtual_ip to stay with any other resource. So it doesn't. You could colocate virtual_ip with WebFS, and drop the colocation of WebSite with WebFS, but it would probably be easier to configure a group with WebFS, virtual_ip, WebSite, and WebFS. Then you would only need promote WebDataClone then start the
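A sketch of the group approach suggested above; the ordering and colocation lines complete that advice and are not quoted from it, and the existing per-resource ordering and colocation constraints would become redundant and could be removed:

    pcs resource group add webgroup WebFS virtual_ip WebSite
    pcs constraint colocation add webgroup with master WebDataClone INFINITY
    pcs constraint order promote WebDataClone then start webgroup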
Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?
Klaus Wenninger wrote: > On 05/20/2016 08:39 AM, Ulrich Windl wrote: > Jehan-Guillaume de Rorthais schrieb am 19.05.2016 um > 21:29 in > > Nachricht <20160519212947.6cc0fd7b@firost>: > > [...] > >> I was thinking of a use case where a graceful demote or stop action failed > >> multiple times and to give a chance to the RA to choose another method to > >> stop > >> the resource before it requires a migration. As instance, PostgreSQL has 3 > >> different kind of stop, the last one being not graceful, but still better > >> than > >> a kill -9. > > > > For example the Xen RA tries a clean shutdown with a timeout of > > about 2/3 of the timeout; it it fails it shuts the VM down the > > hard way. > > > > I don't know Postgres in detail, but I could imagine a three step approach: > > 1) Shutdown after current operations have finished > > 2) Shutdown regardless of pending operations (doing rollbacks) > > 3) Shutdown the hard way, requiring recovery on the next start (I think in > > Oracle this is called a "shutdown abort") > > > > Depending on the scenario one may start at step 2) > > > > [...] > > I think RAs should not rely on "stop" being called multiple times for a > > resource to be stopped. Well, this would be a major architectural change. Currently if stop fails once, the node gets fenced - period. So if we changed this, there would presumably be quite a bit of scope for making the new design address whatever concerns you have about relying on "stop" *sometimes* needing to be called multiple times. For the sake of backwards compatibility with existing RAs, I think we'd have to ensure the current semantics still work. But maybe there could be a new option where RAs are allowed to return OCF_RETRY_STOP to indicate that they want to escalate, or something. However it's not clear how that would be distinguished from an old RA returning the same value as whatever we chose for OCF_RETRY_STOP. > I see a couple of positive points in having something inside pacemaker > that helps the RAs escalating > their stop strategy: > > - this way you have the same logging for all RAs - done within the RA it > would look different with each of them > - timeout-retry stuff is potentially prone to not being implemented > properly - like this you have a proven > implementation within pacemaker > - keeps logic within RA simpler and guides implementation in a certain > direction that makes them look > more similar to each other making it easier to understand an RA you > haven't seen before Yes, all good points which I agree with. > Of course there are basically two approaches to achieve this: > > - give some global or per resource view of pacemaker to the RA and leave > it to the RA to act in a > responsible manner (like telling the RA that there are x stop-retries > to come) > - handle the escalation withing pacemaker and already tell the RA what > you expect it to do > like requesting a graceful / hard / emergency or however you would > call it stop I'd probably prefer the former, to avoid hardcoding any assumptions about the different levels of escalation the RA might want to take. That would almost certainly vary per RA. However, we're slightly off-topic for this thread at this point ;-) ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Informing RAs about recovery: failed resource recovery, or any start-stop cycle?
Ken Gaillot wrote: > A recent thread discussed a proposed new feature, a new environment > variable that would be passed to resource agents, indicating whether a > stop action was part of a recovery. > > Since that thread was long and covered a lot of topics, I'm starting a > new one to focus on the core issue remaining: > > The original idea was to pass the number of restarts remaining before > the resource will no longer tried to be started on the same node. This > involves calculating (fail-count - migration-threshold), and that > implies certain limitations: (1) it will only be set when the cluster > checks migration-threshold; (2) it will only be set for the failed > resource itself, not for other resources that may be recovered due to > dependencies on it. > > Ulrich Windl proposed an alternative: setting a boolean value instead. I > forgot to cc the list on my reply, so I'll summarize now: We would set a > new variable like OCF_RESKEY_CRM_recovery=true whenever a start is > scheduled after a stop on the same node in the same transition. This > would avoid the corner cases of the previous approach; instead of being > tied to migration-threshold, it would be set whenever a recovery was > being attempted, for any reason. And with this approach, it should be > easier to set the variable for all actions on the resource > (demote/stop/start/promote), rather than just the stop. > > I think the boolean approach fits all the envisioned use cases that have > been discussed. Any objections to going that route instead of the count? I think that sounds fine to me. Thanks! ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
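If the boolean form is adopted, RA usage could be as small as the sketch below. The variable name is the one proposed in this thread, not something any released Pacemaker sets, and light_teardown/full_teardown are hypothetical helpers standing in for whatever the RA actually does:

    # Inside an RA's stop action (illustrative only):
    if [ "${OCF_RESKEY_CRM_recovery:-false}" = "true" ]; then
        # Stop is part of a stop/start cycle on this node: keep external
        # registrations (e.g. do not deregister from a load balancer).
        light_teardown
    else
        # No restart expected in this transition: full teardown.
        full_teardown
    fi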
Re: [ClusterLabs] Antw: Re: FR: send failcount to OCF RA start/stop actions
Ken Gaillot wrote: > On 05/12/2016 06:21 AM, Adam Spiers wrote: > > Ken Gaillot wrote: > >> On 05/10/2016 02:29 AM, Ulrich Windl wrote: > Here is what I'm testing currently: > > - When the cluster recovers a resource, the resource agent's stop action > will get a new variable, OCF_RESKEY_CRM_meta_recovery_left = > migration-threshold - fail-count on the local node. [snipped] > > I'd prefer plural (OCF_RESKEY_CRM_meta_recoveries_left) but other than > > that I think it's good. OCF_RESKEY_CRM_meta_retries_left is shorter; > > not sure whether it's marginally worse or better though. > > I'm now leaning to restart_remaining (restarts_remaining would be just > as good). restarts_remaining would be better IMHO, given that it's expected that often multiple restarts will be remaining. [snipped] > > OK, so the RA code would typically be something like this? > > > > if [ ${OCF_RESKEY_CRM_meta_retries_left:-0} = 0 ]; then > > # This is the final stop, so tell the external service > > # not to send any more work our way. > > disable_service > > fi > > I'd use -eq :) but yes Right, -eq is better style for numeric comparison :-) [snipped] > -- If a resource is being recovered, but the fail-count is being cleared > in the same transition, the cluster will ignore migration-threshold (and > the variable will not be set). The RA might see recovery_left=5, 4, 3, > then someone clears the fail-count, and it won't see recovery_left even > though there is a stop and start being attempted. > > > > Hmm. So how would the RA distinguish that case from the one where > > the stop is final? > > That's the main question in all this. There are quite a few scenarios > where there's no meaningful distinction between 0 and unset. With the > current implementation at least, the ideal approach is for the RA to > treat the last stop before a restart the same as a final stop. OK ... [snipped] > > So IIUC, you are talking about a scenario like this: > > > > 1. The whole group starts fine. > > 2. Some time later, the neutron openvswitch agent crashes. > > 3. Pacemaker shuts down nova-compute since it depends upon > >the neutron agent due to being later in the same group. > > 4. Pacemaker repeatedly tries to start the neutron agent, > >but reaches migration-threshold. > > > > At this point, nova-compute is permanently down, but its RA never got > > passed OCF_RESKEY_CRM_meta_retries_left with a value of 0 or unset, > > so it never knew to do a nova service-disable. > > Basically right, but it would be unset (not empty -- it's never empty). > > However, this is a solvable issue. If it's important, I can add the > variable to all siblings of the failed resource if the entire group > would be forced away. Good to hear. > > (BTW, in this scenario, the group is actually cloned, so no migration > > to another compute node happens.) > > Clones are the perfect example of the lack of distinction between 0 and > unset. For an anonymous clone running on all nodes, the countdown will > be 3,2,1,unset because the specific clone instance doesn't need to be > started anywhere else (it looks more like a final stop of that > instance). But for unique clones, or anonymous clones where another node > is available to run the instance, it might be 0. I see, thanks. > > Did I get that right? If so, yes it does sound like an issue. Maybe > > it is possible to avoid this problem by avoiding the use of groups, > > and instead just use interleaved clones with ordering constraints > > between them? 
> > That's not any better, and in fact it would be more difficult to add the > variable to the dependent resource in such a situation, compared to a group. > > Generally, only the failed resource will get the variable, not resources > that may be stopped and started because they depend on the failed > resource in some way. OK. So that might be a problem for you guys than for us, since we use cloned groups, and you don't: https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/high-availability-for-compute-instances/chapter-1-use-high-availability-to-protect-instances > >> More generally, I suppose the point is to better support services that > >> can do a lesser tear-down for a stop-start cycle than a full stop. The > >> distinction between the two cases may not be 100% clear (as with your > >> fencing example), but the idea is that it would be used for > >> optimization, not some required behavior. > > > > This discussion is prompting me to get this clearer in my head, which > > is good :-) > > > > I suppose we *could* simply modify the existing NovaCompute OCF RA so > > that every time it executes the 'stop' action, it immediately sends > > the service-disable message to nova-api, and similarly send > > service-enable during the 'start' action. However this probably has a > > few downsides: > > > > 1. It could cause rapid flapping o
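For what it's worth, the stop-side usage being discussed would look roughly like the sketch below. It assumes the "restarts_remaining" spelling settles and keeps the proposed OCF_RESKEY_CRM_meta_ prefix, treats unset like 0 (a possibly-final stop), relies on ocf-shellfuncs for the OCF_* return codes, and uses two hypothetical helper functions:

    nova_compute_stop() {
        stop_nova_compute_process || return $OCF_ERR_GENERIC

        if [ "${OCF_RESKEY_CRM_meta_restarts_remaining:-0}" -eq 0 ]; then
            # No further restarts expected on this node, so tell nova-api
            # to stop scheduling work here (nova service-disable).
            disable_nova_service
        fi
        return $OCF_SUCCESS
    }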
[ClusterLabs] Resource seems to not obey constraint
I push the following config. The iscsi-target fails as it tries to start on iscsiA-node1 This is because I have no target installed on iscsiA-node1 which is by design. All services listed here should only start on iscsiA-san1 iscsiA-san2. I am using using the iscsiA-node1 basically for quorum and some other minor functions. Can someone please show me where I am going wrong? All services should start on the same node, order is drbd-master vip-blue vip-green iscsi-target iscsi-lun pcs -f ha_config property set symmetric-cluster="true" pcs -f ha_config property set no-quorum-policy="stop" pcs -f ha_config property set stonith-enabled="false" pcs -f ha_config resource defaults resource-stickiness="200" pcs -f ha_config resource create drbd ocf:linbit:drbd drbd_resource=r0 op monitor interval=60s pcs -f ha_config resource master drbd master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true pcs -f ha_config resource create vip-blue ocf:heartbeat:IPaddr2 ip=192.168.101.100 cidr_netmask=32 nic=blue op monitor interval=20s pcs -f ha_config resource create vip-green ocf:heartbeat:IPaddr2 ip=192.168.102.100 cidr_netmask=32 nic=green op monitor interval=20s pcs -f ha_config resource create iscsi-target ocf:heartbeat:iSCSITarget params iqn="iqn.2016-05.trusc.net" implementation="lio-t" op monitor interval="30s" pcs -f ha_config resource create iscsi-lun ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2016-05.trusc.net" lun="1" path="/dev/drbd0" pcs -f ha_config constraint colocation add vip-blue drbd-master INFINITY with-rsc-role=Master pcs -f ha_config constraint colocation add vip-green drbd-master INFINITY with-rsc-role=Master pcs -f ha_config constraint location drbd-master prefers stor-san1=500 pcs -f ha_config constraint location drbd-master avoids stor-node1=INFINITY pcs -f ha_config constraint order promote drbd-master then start vip-blue pcs -f ha_config constraint order start vip-blue then start vip-green pcs -f ha_config constraint order start vip-green then start iscsi-target pcs -f ha_config constraint order start iscsi-target then start iscsi-lun Results: [root@san1 ~]# pcs status Cluster name: storage_cluster Last updated: Fri May 20 17:21:10 2016 Last change: Fri May 20 17:19:43 2016 by root via cibadmin on iscsiA-san1 Stack: corosync Current DC: iscsiA-san1 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 3 nodes and 6 resources configured Online: [ iscsiA-node1 iscsiA-san1 iscsiA-san2 ] Full list of resources: Master/Slave Set: drbd-master [drbd] Masters: [ iscsiA-san1 ] Slaves: [ iscsiA-san2 ] vip-blue (ocf::heartbeat:IPaddr2): Started iscsiA-san1 vip-green (ocf::heartbeat:IPaddr2): Started iscsiA-san1 iscsi-target (ocf::heartbeat:iSCSITarget): FAILED iscsiA-node1 (unmanaged) iscsi-lun (ocf::heartbeat:iSCSILogicalUnit): Stopped Failed Actions: * drbd_monitor_0 on iscsiA-node1 'not installed' (5): call=6, status=Not installed, exitreason='none', last-rc-change='Fri May 20 17:19:44 2016', queued=0ms, exec=0ms * iscsi-target_stop_0 on iscsiA-node1 'not installed' (5): call=24, status=complete, exitreason='Setup problem: couldn't find command: targetcli', last-rc-change='Fri May 20 17:19:45 2016', queued=0ms, exec=18ms * iscsi-lun_monitor_0 on iscsiA-node1 'not installed' (5): call=22, status=complete, exitreason='Undefined iSCSI target implementation', last-rc-change='Fri May 20 17:19:44 2016', queued=0ms, exec=27ms PCSD Status: iscsiA-san1: Online iscsiA-san2: Online iscsiA-node1: Online Daemon Status: corosync: active/disabled pacemaker: active/disabled 
pcsd: active/disabled -- Regards Leon ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
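Note that only the VIPs above are colocated with drbd-master; for the stated goal that all services follow the DRBD master, iscsi-target and iscsi-lun need colocation as well, since ordering alone does not force same-node placement. A possible, untested addition that chains them onto the VIP:

    pcs -f ha_config constraint colocation add iscsi-target with vip-green INFINITY
    pcs -f ha_config constraint colocation add iscsi-lun with iscsi-target INFINITY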
[ClusterLabs] Issue in resource constraints and fencing - RHEL7 - AWS EC2
Hi All, I am implementing 2 node RedHat (RHEL 7.2) HA cluster on Amazon EC2 instance. For floating IP I am using a shell script provided by AWS so that virtual IP float to another instance if any one server failed with health check. In basic level cluster is working but I have 2 issues on that which I describe in bellow. ISSUE 1 = Now I need to configure fencing/STONITH to avoid split brain scenario in storage cluster. I want to use multi-primari (Active/Active) DRBD in my cluster for distributed storage. Is it possible to configure power fencing on AWS EC2 instance? Can any one please guide me on this? ISSUE2 = Currently I am using single primary DRBD distributed storage. I added cluster resources so that if any cluster node goes down then another cluster node will promoted DRBD volume as primary and mount it on /var/www/html. This configuration is working but for only if cluster node1 goes down. If cluster node2 goes down all cluster resources fails over towards cluster node1 but whenever cluster node2 again become on-line then virtual_ip (cluster ip) ownership automatically goes towards cluster node2 again. All the remaining resources not failed over like that. In that case secondary IP stays with Node1 and ownership goes to Node2. I think this is an issue with resource stickiness or resource constraint but here I am totally clueless. Can any one please help me on this? My cluster details: === [root@drbd01 ~]# pcs config Cluster Name: web_cluster Corosync Nodes: ec2-52-24-8-124.us-west-2.compute.amazonaws.com ec2-52-27-70-12.us-west-2.compute.amazonaws.com Pacemaker Nodes: ec2-52-24-8-124.us-west-2.compute.amazonaws.com ec2-52-27-70-12.us-west-2.compute.amazonaws.com Resources: Resource: virtual_ip (class=ocf provider=heartbeat type=IPaddr2) Attributes: ip=10.98.70.100 cidr_netmask=24 Operations: start interval=0s timeout=20s (virtual_ip-start-interval-0s) stop interval=0s timeout=20s (virtual_ip-stop-interval-0s) monitor interval=30s (virtual_ip-monitor-interval-30s) Resource: WebSite (class=ocf provider=heartbeat type=apache) Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://10.98.70.100/server-status Operations: start interval=0s timeout=40s (WebSite-start-interval-0s) stop interval=0s timeout=60s (WebSite-stop-interval-0s) monitor interval=1min (WebSite-monitor-interval-1min) Master: WebDataClone Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true Resource: WebData (class=ocf provider=linbit type=drbd) Attributes: drbd_resource=r1 Operations: start interval=0s timeout=240 (WebData-start-interval-0s) promote interval=0s timeout=90 (WebData-promote-interval-0s) demote interval=0s timeout=90 (WebData-demote-interval-0s) stop interval=0s timeout=100 (WebData-stop-interval-0s) monitor interval=60s (WebData-monitor-interval-60s) Resource: WebFS (class=ocf provider=heartbeat type=Filesystem) Attributes: device=/dev/drbd1 directory=/var/www/html fstype=xfs Operations: start interval=0s timeout=60 (WebFS-start-interval-0s) stop interval=0s timeout=60 (WebFS-stop-interval-0s) monitor interval=20 timeout=40 (WebFS-monitor-interval-20) Stonith Devices: Fencing Levels: Location Constraints: Ordering Constraints: promote WebDataClone then start WebFS (kind:Mandatory) (id:order-WebDataClone-WebFS-mandatory) start WebFS then start virtual_ip (kind:Mandatory) (id:order-WebFS-virtual_ip-mandatory) start virtual_ip then start WebSite (kind:Mandatory) (id:order-virtual_ip-WebSite-mandatory) Colocation Constraints: WebSite with virtual_ip (score:INFINITY) 
(id:colocation-WebSite-virtual_ip-INFINITY) WebFS with WebDataClone (score:INFINITY) (with-rsc-role:Master) (id:colocation-WebFS-WebDataClone-INFINITY) WebSite with WebFS (score:INFINITY) (id:colocation-WebSite-WebFS-INFINITY) Resources Defaults: resource-stickiness: INFINITY Operations Defaults: timeout: 240s Cluster Properties: cluster-infrastructure: corosync cluster-name: web_cluster dc-version: 1.1.13-10.el7-44eb2dd default-resource-stickiness: INFINITY have-watchdog: false no-quorum-policy: ignore stonith-action: poweroff stonith-enabled: false Regards, Pratip Ghosh. -- Thanks, Pratip. +91-9007515795 NOTICE: This e-mail and any attachment may contain confidential information that may be legally privileged. If you are not the intended recipient, you must not review, retransmit, print, copy, use or disseminate it. Please immediately notify us by return e-mail and delete it. ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterl
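One way to keep virtual_ip tied to the DRBD master without a group is to colocate it with WebFS and drop the then-redundant WebSite-with-WebFS colocation; a sketch, with the constraint ID taken from the config above:

    pcs constraint colocation add virtual_ip with WebFS INFINITY
    pcs constraint remove colocation-WebSite-WebFS-INFINITY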
Re: [ClusterLabs] crm_attribute bug in 1.1.15-rc1
On Fri, 20 May 2016 15:31:16 +0300, Andrey Rogovsky wrote:

> Hi!
> I can't get an attribute value:
> /usr/sbin/crm_attribute -q --type nodes --node-uname $HOSTNAME --attr-name master-pgsqld --get-value
> Error performing operation: No such device or address
>
> This value is present:
> crm_mon -A1 | grep master-pgsqld
> + master-pgsqld: 1001
> + master-pgsqld: 1000
> + master-pgsqld: 1

Use crm_master to get master scores easily.

> I use 1.1.15-rc1
> dpkg -l | grep pacemaker-cli-utils
> ii pacemaker-cli-utils 1.1.15-rc1 amd64 Command line interface utilities for Pacemaker
>
> Also, non-integer values work fine:
> /usr/sbin/crm_attribute -q --type nodes --node-uname $HOSTNAME --attr-name pgsql-data-status --get-value
> STREAMING|ASYNC

I'm very confused. It sounds like you are mixing two different resource agents for PostgreSQL. I recognize the master scores set by the pgsqlms RA (PAF project) and the data-status attribute from the pgsql RA...

> I think this patch
> https://github.com/ClusterLabs/pacemaker/commit/26d34a9171bddae67c56ebd8c2513ea8fa770204?diff=unified#diff-55bc49a57c12093902e3842ce349a71fR269
> is not applied in 1.1.15-rc1?
>
> How can I get an integer value from a node attribute?

With the correct name for the given attribute.

Regards,
--
Jehan-Guillaume de Rorthais
Dalibo

___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
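A concrete form of that suggestion; the resource instance name pgsqld is taken from the crm_mon output above, option spellings may vary slightly between Pacemaker versions (see crm_master --help), and the second command is only relevant if the score was stored as a transient (reboot-lifetime) attribute rather than a permanent one:

    # crm_master wraps crm_attribute and builds the master-<resource> name for
    # you; outside an RA the resource must be named explicitly:
    crm_master -q -G -r pgsqld

    # If the attribute is transient, query the status section instead of the
    # permanent nodes section:
    crm_attribute -q --lifetime reboot --node-uname $HOSTNAME --attr-name master-pgsqld --get-value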
Re: [ClusterLabs] Pacemaker not invoking monitor after $interval
> Original message
> From: Jehan-Guillaume de Rorthais [mailto:j...@dalibo.com]
> Sent: Friday, 20 May 2016 13:52
> To: Felix Zachlod (Lists)
> Cc: users@clusterlabs.org
> Subject: Re: [ClusterLabs] Pacemaker not invoking monitor after $interval
>
> Le Fri, 20 May 2016 11:33:39 +, "Felix Zachlod (Lists)" wrote:
>
> > Hello!
> >
> > I am currently working on a cluster setup which includes several
> > resources with "monitor interval=XXs" set. As far as I understand
> > this should run the monitor action on the resource agent every XX
> > seconds. But it seems it doesn't.
>
> How do you know it doesn't? Are you looking at crm_mon? log files?

I created debug output from my RA. Furthermore I had a blackbox dump. But it has now turned out that, for my resource, I had to change the meta-data to advertise the monitor action twice (once for the Slave role, once for the Master role) and to configure

  op monitor role=x interval=y

instead of

  op monitor interval=x

Since I changed that, monitoring works as desired for this resource, at least for now. I am not sure why a Master/Slave resource has to advertise distinct monitor actions for both roles, but it seems related to that. I still don't see any monitor invocations in the log, but there is probably still something wrong with the log level.

Thanks anyway!

regards, Felix

___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
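For reference, one way to express this with pcs (other configuration shells have equivalents); the resource and agent names below are placeholders, the RA's meta-data likewise needs two <action name="monitor" .../> entries, one per role, and note that Pacemaker requires the two monitor operations to use different intervals so it can tell them apart:

    pcs resource create my_ms ocf:myprovider:myagent \
        op monitor interval=15s role=Master \
        op monitor interval=16s role=Slave
    pcs resource master my_ms master-max=1 master-node-max=1 \
        clone-max=2 clone-node-max=1 notify=true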
[ClusterLabs] crm_attribute bug in 1.1.15-rc1
Hi! I cant get attribute value: /usr/sbin/crm_attribute -q --type nodes --node-uname $HOSTNAME --attr-name master-pgsqld --get-value Error performing operation: No such device or address This value is present: crm_mon -A1 | grep master-pgsqld + master-pgsqld: 1001 + master-pgsqld: 1000 + master-pgsqld: 1 I use 1.1.15-rc1 dpkg -l | grep pacemaker-cli-utils ii pacemaker-cli-utils1.1.15-rc1amd64 Command line interface utilities for Pacemaker Also non-integer values work file: /usr/sbin/crm_attribute -q --type nodes --node-uname $HOSTNAME --attr-name pgsql-data-status --get-value STREAMING|ASYNC I thinking this patch https://github.com/ClusterLabs/pacemaker/commit/26d34a9171bddae67c56ebd8c2513ea8fa770204?diff=unified#diff-55bc49a57c12093902e3842ce349a71fR269 is not apply in 1.1.15-rc1? How I can get integere value from node attribute? ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[ClusterLabs] Antw: Pacemaker not invoking monitor after $interval
>>> "Felix Zachlod (Lists)" schrieb am 20.05.2016 um 13:33 in Nachricht <670f732376b88843b8df7ad917cf8dd9289c0...@bulla.intern.onesty-tech.loc>: > Hello! > > I am currently working on a cluster setup which includes several resources > with "monitor interval=XXs" set. As far as I understand this should run the > monitor action on the resource agent every XX seconds. But it seems it > doesn't. Actually monitor is only invoked in special condition, e.g. cleanup, > start and so on, but never for a running (or stopped) resource. So it won't > detect any resource failures, unless a manual action takes place. It won't > update master preference either when set in the monitor action. > > Are there any special conditions under which the monitor will not be > executed? (Cluster IS managed though) > > property cib-bootstrap-options: \ > have-watchdog=false \ > dc-version=1.1.13-10.el7_2.2-44eb2dd \ > cluster-infrastructure=corosync \ > cluster-name=sancluster \ > maintenance-mode=false \ > symmetric-cluster=false \ > last-lrm-refresh=1463739404 \ > stonith-enabled=true \ > stonith-action=reboot > > Thank you in advance, regards, Felix Try "crm_mon -1Arfj" (or similar) and look into your logs "grep monitor ...". > > -- > Mit freundlichen Grüßen > Dipl. Inf. (FH) Felix Zachlod > > Onesty Tech GmbH > Lieberoser Str. 7 > 03046 Cottbus > > Tel.: +49 (355) 289430 > Fax.: +49 (355) 28943100 > f...@onesty-tech.de > > Registergericht Amtsgericht Cottbus, HRB 7885 Geschäftsführer Romy Schötz, > Thomas Menzel > > > > ___ > Users mailing list: Users@clusterlabs.org > http://clusterlabs.org/mailman/listinfo/users > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Pacemaker not invoking monitor after $interval
Le Fri, 20 May 2016 11:33:39 +, "Felix Zachlod (Lists)" a écrit : > Hello! > > I am currently working on a cluster setup which includes several resources > with "monitor interval=XXs" set. As far as I understand this should run the > monitor action on the resource agent every XX seconds. But it seems it > doesn't. How do you know it doesn't? Are you looking at crm_mon? log files? If you are looking at crm_mon, the output will not be updated unless some changes are applied to the CIB or a transition is in progress. > Actually monitor is only invoked in special condition, e.g. cleanup, > start and so on, but never for a running (or stopped) resource. So it won't > detect any resource failures, unless a manual action takes place. It won't > update master preference either when set in the monitor action. > > Are there any special conditions under which the monitor will not be > executed? Could you provide us with your Pacemaker setup? > (Cluster IS managed though) Resources can be unmanaged individually as well. Regards, -- Jehan-Guillaume de Rorthais Dalibo ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[ClusterLabs] Pacemaker not invoking monitor after $interval
Hello! I am currently working on a cluster setup which includes several resources with "monitor interval=XXs" set. As far as I understand this should run the monitor action on the resource agent every XX seconds. But it seems it doesn't. Actually monitor is only invoked in special condition, e.g. cleanup, start and so on, but never for a running (or stopped) resource. So it won't detect any resource failures, unless a manual action takes place. It won't update master preference either when set in the monitor action. Are there any special conditions under which the monitor will not be executed? (Cluster IS managed though) property cib-bootstrap-options: \ have-watchdog=false \ dc-version=1.1.13-10.el7_2.2-44eb2dd \ cluster-infrastructure=corosync \ cluster-name=sancluster \ maintenance-mode=false \ symmetric-cluster=false \ last-lrm-refresh=1463739404 \ stonith-enabled=true \ stonith-action=reboot Thank you in advance, regards, Felix -- Mit freundlichen Grüßen Dipl. Inf. (FH) Felix Zachlod Onesty Tech GmbH Lieberoser Str. 7 03046 Cottbus Tel.: +49 (355) 289430 Fax.: +49 (355) 28943100 f...@onesty-tech.de Registergericht Amtsgericht Cottbus, HRB 7885 Geschäftsführer Romy Schötz, Thomas Menzel ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[ClusterLabs] Pacemaker reload Master/Slave resource
version 1.1.13-10.el7_2.2-44eb2dd

Hello!

I am currently developing a master/slave resource agent. So far it is working just fine, but this resource agent implements reload(), and reload does not work as expected when the resource is running as Master:

The reload action is invoked and succeeds, returning 0. The resource is still Master, and monitor returns $OCF_RUNNING_MASTER. But Pacemaker considers the instance to be a Slave afterwards, although only reload was invoked: no monitor, no demote, etc. I first thought that reload should perhaps return $OCF_RUNNING_MASTER too, but that makes the resource fail on reload; it seems 0 is the only valid return code.

I can recover the cluster state by running "resource $resourcename promote", which calls notify, promote, notify; afterwards my resource is considered Master again. After "PEngine Recheck Timer (I_PE_CALC) just popped (90ms)", the cluster manager will also promote the resource itself. But this can lead to unexpected results: it could promote the resource on the wrong node, so that both sides are actually running as master, and the cluster will not even notice, since it does not call monitor either.

Is this a bug?

regards, Felix

trace May 20 12:58:31 cib_create_op(609):0: Sending call options: 0010, 1048576
trace May 20 12:58:31 cib_native_perform_op_delegate(384):0: Sending cib_modify message to CIB service (timeout=120s)
trace May 20 12:58:31 crm_ipc_send(1175):0: Sending from client: cib_shm request id: 745 bytes: 1070 timeout:12 msg...
trace May 20 12:58:31 crm_ipc_send(1188):0: Message sent, not waiting for reply to 745 from cib_shm to 1070 bytes...
trace May 20 12:58:31 cib_native_perform_op_delegate(395):0: Reply: No data to dump as XML
trace May 20 12:58:31 cib_native_perform_op_delegate(398):0: Async call, returning 268
trace May 20 12:58:31 do_update_resource(2274):0: Sent resource state update message: 268 for reload=0 on scst_dg_ssd
trace May 20 12:58:31 cib_client_register_callback_full(606):0: Adding callback cib_rsc_callback for call 268
trace May 20 12:58:31 process_lrm_event(2374):0: Op scst_dg_ssd_reload_0 (call=449, stop-id=scst_dg_ssd:449, remaining=3): Confirmed
notice May 20 12:58:31 process_lrm_event(2392):0: Operation scst_dg_ssd_reload_0: ok (node=alpha, call=449, rc=0, cib-update=268, confirmed=true)
debug May 20 12:58:31 update_history_cache(196):0: Updating history for 'scst_dg_ssd' with reload op
trace May 20 12:58:31 crm_ipc_read(992):0: No message from lrmd received: Resource temporarily unavailable
trace May 20 12:58:31 mainloop_gio_callback(654):0: Message acquisition from lrmd[0x22b0ec0] failed: No message of desired type (-42)
trace May 20 12:58:31 crm_fsa_trigger(293):0: Invoked (queue len: 0)
trace May 20 12:58:31 s_crmd_fsa(159):0: FSA invoked with Cause: C_FSA_INTERNAL State: S_NOT_DC
trace May 20 12:58:31 s_crmd_fsa(246):0: Exiting the FSA
trace May 20 12:58:31 crm_fsa_trigger(295):0: Exited (queue len: 0)
trace May 20 12:58:31 crm_ipc_read(989):0: Received cib_shm event 2108, size=183, rc=183, text:

___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?
Le Fri, 20 May 2016 11:12:28 +0200, "Ulrich Windl" a écrit : > >>> Jehan-Guillaume de Rorthais schrieb am 20.05.2016 um > 09:59 in > Nachricht <20160520095934.029c1822@firost>: > > Le Fri, 20 May 2016 08:39:42 +0200, > > "Ulrich Windl" a écrit : > > > >> >>> Jehan-Guillaume de Rorthais schrieb am 19.05.2016 um > >> >>> 21:29 in > >> Nachricht <20160519212947.6cc0fd7b@firost>: > >> [...] > >> > I was thinking of a use case where a graceful demote or stop action > failed > >> > multiple times and to give a chance to the RA to choose another method to > > >> > stop > >> > the resource before it requires a migration. As instance, PostgreSQL has > 3 > >> > different kind of stop, the last one being not graceful, but still better > > >> > than > >> > a kill -9. > >> > >> For example the Xen RA tries a clean shutdown with a timeout of about 2/3 > of > >> the timeout; it it fails it shuts the VM down the hard way. > > > > Reading the Xen RA, I see they added a shutdown timeout escalation > > parameter. > > Not quite: > if [ -n "$OCF_RESKEY_shutdown_timeout" ]; then > timeout=$OCF_RESKEY_shutdown_timeout > elif [ -n "$OCF_RESKEY_CRM_meta_timeout" ]; then > # Allow 2/3 of the action timeout for the orderly shutdown > # (The origin unit is ms, hence the conversion) > timeout=$((OCF_RESKEY_CRM_meta_timeout/1500)) > else > timeout=60 > fi > > > This is a reasonable solution, but isn't it possible to get the action > > timeout > > directly? I looked for such information in the past with no success. > > See above. Gosh, this is embarrassing...how could we miss that? Thank you for pointing this! ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[ClusterLabs] Antw: Re: Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?
>>> Jehan-Guillaume de Rorthais wrote on 20.05.2016 at 09:59 in message <20160520095934.029c1822@firost>:

> Le Fri, 20 May 2016 08:39:42 +0200, "Ulrich Windl" wrote:
>
>> >>> Jehan-Guillaume de Rorthais wrote on 19.05.2016 at 21:29 in message <20160519212947.6cc0fd7b@firost>:
>> [...]
>> > I was thinking of a use case where a graceful demote or stop action failed
>> > multiple times and to give a chance to the RA to choose another method to stop
>> > the resource before it requires a migration. As instance, PostgreSQL has 3
>> > different kind of stop, the last one being not graceful, but still better than
>> > a kill -9.
>>
>> For example the Xen RA tries a clean shutdown with a timeout of about 2/3 of
>> the timeout; if it fails it shuts the VM down the hard way.
>
> Reading the Xen RA, I see they added a shutdown timeout escalation parameter.

Not quite:

    if [ -n "$OCF_RESKEY_shutdown_timeout" ]; then
      timeout=$OCF_RESKEY_shutdown_timeout
    elif [ -n "$OCF_RESKEY_CRM_meta_timeout" ]; then
      # Allow 2/3 of the action timeout for the orderly shutdown
      # (The origin unit is ms, hence the conversion)
      timeout=$((OCF_RESKEY_CRM_meta_timeout/1500))
    else
      timeout=60
    fi

> This is a reasonable solution, but isn't it possible to get the action timeout
> directly? I looked for such information in the past with no success.

See above.

>> I don't know Postgres in detail, but I could imagine a three step approach:
>> 1) Shutdown after current operations have finished
>> 2) Shutdown regardless of pending operations (doing rollbacks)
>> 3) Shutdown the hard way, requiring recovery on the next start (I think in
>> Oracle this is called a "shutdown abort")
>
> Exactly.
>
>> Depending on the scenario one may start at step 2)
>
> Indeed.
>
>> [...]
>> I think RAs should not rely on "stop" being called multiple times for a
>> resource to be stopped.
>
> Ok, so the RA should take care of its own escalation during a single action.
>
> Thanks,

Regards,
Ulrich

___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?
On 05/20/2016 08:39 AM, Ulrich Windl wrote:
> Jehan-Guillaume de Rorthais wrote on 19.05.2016 at 21:29 in message <20160519212947.6cc0fd7b@firost>:
> [...]
>> I was thinking of a use case where a graceful demote or stop action failed
>> multiple times and to give a chance to the RA to choose another method to stop
>> the resource before it requires a migration. As instance, PostgreSQL has 3
>> different kind of stop, the last one being not graceful, but still better than
>> a kill -9.
>
> For example the Xen RA tries a clean shutdown with a timeout of about 2/3 of
> the timeout; if it fails it shuts the VM down the hard way.
>
> I don't know Postgres in detail, but I could imagine a three step approach:
> 1) Shutdown after current operations have finished
> 2) Shutdown regardless of pending operations (doing rollbacks)
> 3) Shutdown the hard way, requiring recovery on the next start (I think in
> Oracle this is called a "shutdown abort")
>
> Depending on the scenario one may start at step 2)
>
> [...]
> I think RAs should not rely on "stop" being called multiple times for a
> resource to be stopped.

I see a couple of positive points in having something inside Pacemaker that helps the RAs escalate their stop strategy:

- you get the same logging for all RAs; done within each RA it would look different for each of them
- timeout/retry logic is prone to being implemented improperly; this way there is one proven implementation within Pacemaker
- it keeps the logic within the RA simpler and guides implementations in a direction that makes RAs look more similar to each other, which makes it easier to understand an RA you haven't seen before

Of course there are basically two approaches to achieve this:

- give the RA some global or per-resource view of Pacemaker's state and leave it to the RA to act responsibly (like telling the RA that there are x stop retries to come)
- handle the escalation within Pacemaker and tell the RA directly what you expect it to do, e.g. requesting a graceful / hard / emergency (or however you would call it) stop

> Regards,
> Ulrich

___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [ClusterLabs] Antw: Re: Informing RAs about recovery: failed resource recovery, or any start-stop cycle?
Le Fri, 20 May 2016 08:39:42 +0200, "Ulrich Windl" a écrit : > >>> Jehan-Guillaume de Rorthais schrieb am 19.05.2016 um > >>> 21:29 in > Nachricht <20160519212947.6cc0fd7b@firost>: > [...] > > I was thinking of a use case where a graceful demote or stop action failed > > multiple times and to give a chance to the RA to choose another method to > > stop > > the resource before it requires a migration. As instance, PostgreSQL has 3 > > different kind of stop, the last one being not graceful, but still better > > than > > a kill -9. > > For example the Xen RA tries a clean shutdown with a timeout of about 2/3 of > the timeout; it it fails it shuts the VM down the hard way. Reading the Xen RA, I see they added a shutdown timeout escalation parameter. This is a reasonable solution, but isn't it possible to get the action timeout directly? I looked for such information in the past with no success. > > I don't know Postgres in detail, but I could imagine a three step approach: > 1) Shutdown after current operations have finished > 2) Shutdown regardless of pending operations (doing rollbacks) > 3) Shutdown the hard way, requiring recovery on the next start (I think in > Oracle this is called a "shutdown abort") Exactly. > Depending on the scenario one may start at step 2) Indeed. > [...] > I think RAs should not rely on "stop" being called multiple times for a > resource to be stopped. Ok, so the RA should take care of their own escalation during a single action. Thanks, ___ Users mailing list: Users@clusterlabs.org http://clusterlabs.org/mailman/listinfo/users Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
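To make that concrete, an escalation handled entirely inside a single stop action could look roughly like the sketch below. It uses pg_ctl's smart/fast/immediate stop modes; the pgdata parameter name is borrowed from the common pgsql RA, the equal split of the timeout is arbitrary, and ocf-shellfuncs is assumed to be sourced for the OCF_* return codes:

    pgsql_stop() {
        # OCF_RESKEY_CRM_meta_timeout is the action timeout in milliseconds;
        # give each escalation step roughly a third of it.
        local total_s=$(( ${OCF_RESKEY_CRM_meta_timeout:-60000} / 1000 ))
        local step_s=$(( total_s / 3 ))

        # 1) graceful: wait for clients to disconnect on their own
        pg_ctl stop -D "$OCF_RESKEY_pgdata" -m smart -t "$step_s" -w && return $OCF_SUCCESS
        # 2) fast: disconnect clients, roll back open transactions
        pg_ctl stop -D "$OCF_RESKEY_pgdata" -m fast -t "$step_s" -w && return $OCF_SUCCESS
        # 3) immediate: abort; crash recovery will run on the next start
        pg_ctl stop -D "$OCF_RESKEY_pgdata" -m immediate -t "$step_s" -w && return $OCF_SUCCESS

        return $OCF_ERR_GENERIC
    }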