[ClusterLabs] pcs remove command doesn't work to remove monitor operations

2017-11-19 Thread jaspal singla
Hello Community,

I need some clarification about the recent Pacemaker version and the "op"
operations (remove/add).


About a year ago, I deployed a Pacemaker cluster using the versions below,
and I was (and still am) able to run "pcs resource op remove FSCheck
monitor interval=30s":

Pacemaker Version:
pacemaker-cli-1.1.15-11.el7.x86_64
pacemaker-libs-1.1.15-11.el7.x86_64
pacemaker-cluster-libs-1.1.15-11.el7.x86_64
pacemaker-1.1.15-11.el7.x86_64


Now I need to deploy a Pacemaker cluster again on a different setup, using
the upgraded Pacemaker version listed below:

Pacemaker Version:
pacemaker-libs-1.1.16-12.el7.x86_64
pacemaker-cli-1.1.16-12.el7.x86_64
pacemaker-1.1.16-12.el7.x86_64
pacemaker-cluster-libs-1.1.16-12.el7.x86_64


Strangely, the pcs resource op remove command has stopped working, and I
get the error below:

Error:
[root@ha2-105 HA7]# pcs resource op remove FSCheck monitor interval=15s
Error: Unable to find operation matching: monitor interval=15s
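
One avenue worth trying: this error usually means pcs could not find a
configured operation whose attributes match exactly, so it can help to list
the resource's operations and remove the monitor by its operation id
instead. (The id FSCheck-OP-monitor below is an assumption, borrowed from
the resource definitions later in this archive; use whatever id your CIB
actually shows.)

pcs resource show FSCheck
pcs resource op remove FSCheck-OP-monitor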



Has anything changed in the newer version of Pacemaker? Any insight would
be highly appreciated.

Thanks!!
___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Entire Group stop on stopping of single Resource (Jan Pokorný)

2016-08-23 Thread jaspal singla
Thanks,
Jaspal Singla

On Mon, Aug 22, 2016 at 7:42 PM, <users-requ...@clusterlabs.org> wrote:

> Send Users mailing list submissions to
> users@clusterlabs.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://clusterlabs.org/mailman/listinfo/users
> or, via email, send a message with subject or body 'help' to
> users-requ...@clusterlabs.org
>
> You can reach the person managing the list at
> users-ow...@clusterlabs.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Users digest..."
>
>
> Today's Topics:
>
>1. Re: Mysql slave did not start replication after failure, and
>   read-only IP also remained active on the much outdated slave
>   (Attila Megyeri)
>2. Re: Entire Group stop on stopping of single Resource (Jan Pokorný)
>3. Re: Mysql slave did not start replication after failure, and
>   read-only IP also remained active on the much outdated slave
>   (Ken Gaillot)
>
>
> ------------------------------
>
> Message: 1
> Date: Mon, 22 Aug 2016 14:24:28 +0200
> From: Attila Megyeri <amegy...@minerva-soft.com>
> To: Cluster Labs - All topics related to open-source clustering
> welcomed <users@clusterlabs.org>
> Subject: Re: [ClusterLabs] Mysql slave did not start replication after
> failure, and read-only IP also remained active on the much outdated
> slave
> Message-ID:
> <DA9AC973EEA03848B46F36E07B947E0403E5393E9922@DESRV05.minerva-soft.local>
>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Andrei,
>
> I waited several hours, and nothing happened.
>
> I assume the RA does not handle this case properly. MySQL was running,
> but the "show slave status" command returned something the RA was not
> prepared to parse, and instead of reporting a non-readable attribute it
> returned a generic error that did not stop the server.
>
> Rgds,
> Attila
>
>
> -----Original Message-----
> From: Andrei Borzenkov [mailto:arvidj...@gmail.com]
> Sent: Monday, August 22, 2016 11:42 AM
> To: Cluster Labs - All topics related to open-source clustering welcomed <
> users@clusterlabs.org>
> Subject: Re: [ClusterLabs] Mysql slave did not start replication after
> failure, and read-only IP also remained active on the much outdated slave
>
> On Mon, Aug 22, 2016 at 12:18 PM, Attila Megyeri
> <amegy...@minerva-soft.com> wrote:
> > Dear community,
> >
> >
> >
> > A few days ago we had an issue in our MySQL M/S replication cluster.
> >
> > We have one R/W master and one RO slave. The RO VIP is supposed to be
> > running on the slave if it is not too far behind the master; if any
> > error occurs, the RO VIP is moved to the master.
> >
> > Something happened with the slave MySQL (some disk issue, still
> > investigating), but the problem is that the slave VIP remained on the
> > slave device, even though the slave process was not running and the
> > server was much outdated.
> >
> >
> >
> > During the issue the following log entries appeared (just an extract,
> > as the full log would be too long):
> >
> > Aug 20 02:04:07 ctdb1 corosync[1056]:   [MAIN  ] Corosync main process was not scheduled for 14088.5488 ms (threshold is 4000.0000 ms). Consider token timeout increase.
> >
> > Aug 20 02:04:07 ctdb1 corosync[1056]:   [TOTEM ] A processor failed, forming new configuration.
> >
> > Aug 20 02:04:34 ctdb1 corosync[1056]:   [MAIN  ] Corosync main process was not scheduled for 27065.2559 ms (threshold is 4000.0000 ms). Consider token timeout increase.
> >
> > Aug 20 02:04:34 ctdb1 corosync[1056]:   [TOTEM ] A new membership (xxx:6720) was formed. Members left: 168362243 168362281 168362282 168362301 168362302 168362311 168362312 1
> >
> > Aug 20 02:04:34 ctdb1 corosync[1056]:   [TOTEM ] A new membership (xxx:6724) was formed. Members
> >
> > ..
> >
> > Aug 20 02:13:28 ctdb1 corosync[1056]:   [MAIN  ] Completed service synchronization, ready to provide service.
> >
> > ..
> >
> > Aug 20 02:13:29 ctdb1 attrd[1584]:   notice: attrd_trigger_update: Sending flush op to all hosts for: readable (1)
> >
> > ..
> >
> > Aug 20 02:13:32 ctdb1 mysql(db-mysql)[10492]: INFO: post-demote notification [...]

[ClusterLabs] Entire Group stop on stopping of single Resource

2016-08-19 Thread jaspal singla
Hello Community,

I have a resource group (ctm_service) comprising various resources. The
requirement is that when one of its resources stops for some time (10-20
seconds), the entire group should be stopped.

Is it possible to achieve this in Pacemaker? Please help!

__

 Resource Group: ctm_service
     FSCheck            (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/FsCheckAgent.py): (target-role:Stopped) Stopped
     NTW_IF             (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/NtwIFAgent.py): (target-role:Stopped) Stopped
     CTM_RSYNC          (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/RsyncAgent.py): (target-role:Stopped) Stopped
     REPL_IF            (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ODG_IFAgent.py): (target-role:Stopped) Stopped
     ORACLE_REPLICATOR  (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ODG_ReplicatorAgent.py): (target-role:Stopped) Stopped
     CTM_SID            (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/OracleAgent.py): (target-role:Stopped) Stopped
     CTM_SRV            (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/CtmAgent.py): (target-role:Stopped) Stopped
     CTM_APACHE         (lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ApacheAgent.py): (target-role:Stopped) Stopped

_


These are the resource and resource group properties:


___

pcs -f cib.xml.geo resource create FSCheck \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/FsCheckAgent.py \
  op monitor id=FSCheck-OP-monitor name=monitor interval=30s
pcs -f cib.xml.geo resource create NTW_IF \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/NtwIFAgent.py \
  op monitor id=NtwIFAgent-OP-monitor name=monitor interval=30s
pcs -f cib.xml.geo resource create CTM_RSYNC \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/RsyncAgent.py \
  op monitor id=CTM_RSYNC-OP-monitor name=monitor interval=30s on-fail=ignore \
  stop id=CTM_RSYNC-OP-stop interval=0 on-fail=stop
pcs -f cib.xml.geo resource create REPL_IF \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ODG_IFAgent.py \
  op monitor id=REPL_IF-OP-monitor name=monitor interval=30 on-fail=ignore \
  stop id=REPL_IF-OP-stop interval=0 on-fail=stop
pcs -f cib.xml.geo resource create ORACLE_REPLICATOR \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ODG_ReplicatorAgent.py \
  op monitor id=ORACLE_REPLICATOR-OP-monitor name=monitor interval=30s \
  on-fail=ignore stop id=ORACLE_REPLICATOR-OP-stop interval=0 on-fail=stop
pcs -f cib.xml.geo resource create CTM_SID \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/OracleAgent.py \
  op monitor id=CTM_SID-OP-monitor name=monitor interval=30s
pcs -f cib.xml.geo resource create CTM_SRV \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/CtmAgent.py \
  op monitor id=CTM_SRV-OP-monitor name=monitor interval=30s
pcs -f cib.xml.geo resource create CTM_APACHE \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/ApacheAgent.py \
  op monitor id=CTM_APACHE-OP-monitor name=monitor interval=30s
pcs -f cib.xml.geo resource create CTM_HEARTBEAT \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/HeartBeat.py \
  op monitor id=CTM_HEARTBEAT-OP-monitor name=monitor interval=30s
pcs -f cib.xml.geo resource create FLASHBACK \
  lsb:../../..//cisco/PrimeOpticalServer/HA/bin/FlashBackMonitor.py \
  op monitor id=FLASHBACK-OP-monitor name=monitor interval=30s


pcs -f cib.xml.geo resource group add ctm_service FSCheck NTW_IF CTM_RSYNC \
  REPL_IF ORACLE_REPLICATOR CTM_SID CTM_SRV CTM_APACHE

pcs -f cib.xml.geo resource meta ctm_service migration-threshold=1 \
  failure-timeout=10 target-role=stopped
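
For reference, a minimal sketch of the step that would push this offline
CIB into the running cluster (an assumption about the intended workflow;
the original message stops at building cib.xml.geo):

pcs cluster cib-push cib.xml.geo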






Any help will be highly appreciated!

Thanks,
Jaspal Singla
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Freezing/Unfreezing in Pacemaker ?

2016-04-07 Thread jaspal singla
Hello,

As we have clusvcadm -Z <service> and clusvcadm -U <service> to freeze and
unfreeze a service in CMAN, I would really appreciate it if someone could
give some pointers for freezing/unfreezing a resource in Pacemaker (pcs)
as well.
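
A minimal sketch of the closest pcs analogue, assuming "freeze" means
leaving the resource running but unmanaged, so the cluster neither stops
nor recovers it (FSCheck is just an illustrative resource name):

# "freeze": stop managing the resource; it is left running as-is
pcs resource unmanage FSCheck

# "unfreeze": resume normal management
pcs resource manage FSCheck

Setting the is-managed=false meta attribute on the resource has the same
effect as unmanage.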

Thanks,
Jaspal Singla
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Cluster resources migration from CMAN to Pacemaker

2016-02-09 Thread jaspal singla
> Message-ID: <20160130024803.ga27...@redhat.com>
> Content-Type: text/plain; charset="utf-8"
>
> > On 27/01/16 19:41 +0100, Jan Pokorný wrote:
> > On 27/01/16 11:04 -0600, Ken Gaillot wrote:
> >> On 01/27/2016 02:34 AM, jaspal singla wrote:
> >>> 1) In CMAN, there was the meta attribute autostart=0 (this parameter
> >>> disables the start of all services when RGManager starts). Is there
> >>> any way to get such behavior in Pacemaker?
> >
> > Please be more careful about the descriptions; autostart=0 specified
> > at the given resource group ("service" or "vm" tag) means just not to
> > start anything contained in this very one automatically (also upon
> > new resources being defined, IIUIC), definitely not "all services".
> >
> > [...]
> >
> >> I don't think there's any exact replacement for autostart in pacemaker.
> >> Probably the closest is to set target-role=Stopped before stopping the
> >> cluster, and set target-role=Started when services are desired to be
> >> started.
>
> Besides is-managed=false (as currently used in clufter), I also looked
> at outright disabling the "start" action, but this turned out to be a
> naive approach caused by unclear documentation.
>
> Pushing for a bit more clarity (hopefully):
> https://github.com/ClusterLabs/pacemaker/pull/905
>
> >>> 2) Please suggest some alternatives to exclusive=0 and
> >>> __independent_subtree? What do we have in Pacemaker instead of these?
>
> (exclusive property discussed in the other subthread; as a recap,
> no extra effort is needed to achieve exclusive=0, exclusive=1 is
> currently a show stopper in clufter as neither approach is versatile
> enough)
>
> > For __independent_subtree, each component must be a separate pacemaker
> > resource, and the constraints between them would depend on exactly what
> > you were trying to accomplish. The key concepts here are ordering
> > constraints, colocation constraints, kind=Mandatory/Optional (for
> > ordering constraints), and ordered sets.
>
> Current approach in clufter as of the next branch:
> - __independent_subtree=1 -> do nothing special (hardly can be
>  improved?)
> - __independent_subtree=2 -> for that very resource, set operations
>  as follows:
>  monitor (interval=60s) on-fail=ignore
>  stop interval=0 on-fail=stop
>
> Groups carrying such resources are not unrolled into primitives plus
> constraints, as the above might suggest (also the default kind=Mandatory
> for the underlying order constraints should fit well).
>
> Please holler if this is not sound.
>
>
> So when put together with some other changes/fixes, the currently
> suggested/informative sequence of pcs commands goes like this:
>
> pcs cluster auth ha1-105.test.com
> pcs cluster setup --start --name HA1-105_CLUSTER ha1-105.test.com \
>   --consensus 12000 --token 1 --join 60
> sleep 60
> pcs cluster cib tmp-cib.xml --config
> pcs -f tmp-cib.xml property set stonith-enabled=false
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-FSCheck \
>   lsb:../../..//data/Product/HA/bin/FsCheckAgent.py \
>   op monitor interval=30s
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-NTW_IF \
>   lsb:../../..//data/Product/HA/bin/NtwIFAgent.py \
>   op monitor interval=30s
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-CTM_RSYNC \
>   lsb:../../..//data/Product/HA/bin/RsyncAgent.py \
>   op monitor interval=30s on-fail=ignore stop interval=0 on-fail=stop
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-REPL_IF \
>   lsb:../../..//data/Product/HA/bin/ODG_IFAgent.py \
>   op monitor interval=30s on-fail=ignore stop interval=0 on-fail=stop
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-ORACLE_REPLICATOR \
>   lsb:../../..//data/Product/HA/bin/ODG_ReplicatorAgent.py \
>   op monitor interval=30s on-fail=ignore stop interval=0 on-fail=stop
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-CTM_SID \
>   lsb:../../..//data/Product/HA/bin/OracleAgent.py \
>   op monitor interval=30s
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-CTM_SRV \
>   lsb:../../..//data/Product/HA/bin/CtmAgent.py \
>   op monitor interval=30s
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-CTM_APACHE \
>   lsb:../../..//data/Product/HA/bin/ApacheAgent.py \
>   op monitor interval=30s
> pcs -f tmp-cib.xml \
>   resource create RESOURCE-script-CTM_HEARTBEAT \
>   lsb:../../..//data/Product/HA/bin/HeartBeat.py \
>   op monitor inte

[ClusterLabs] Cluster resources migration from CMAN to Pacemaker

2016-01-22 Thread jaspal singla
Hello Everyone,


I desperately need some help migrating my cluster configuration from CMAN
(RHEL 6.5) to Pacemaker (RHEL 7.1).


I have explored a lot but couldn't find out how to configure the same
resources (created in CMAN's cluster.conf file) in Pacemaker.


I'd like to share the RHEL 6.5 cluster.conf and want to achieve the same
thing through Pacemaker. Any help would be greatly appreciated!!



*Cluster.conf file*

##

[Most of the cluster.conf XML was stripped by the mailing-list archive;
only these fragments survive:]

<script ref="CTM_SRV">
<script ref="CTM_APACHE"/>

###


*Queries/concerns:*

-> How can I define the above 10 resources through Pacemaker?
-> The services used in the <service> section are not init.d services;
they use script references to the resources defined above. How could I do
the same thing in Pacemaker?
A couple of other concerns:
-> How do I create failover domains in Pacemaker and link resources to
them?
-> By default, several pre-defined resource agents ship with Pacemaker,
and we can use them when our requirements match one of them (e.g. IPaddr2,
apache). But what if I have some Python scripts and want to use those
scripts as resources? Is there any way to do that? (See the sketch after
this list.)
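
On the last point, a minimal sketch of one common approach: wrap the
Python script in an LSB-style init script and register it with lsb:, as
the agents elsewhere in this archive do. Every path and name below is
hypothetical, for illustration only:

#!/bin/sh
# /etc/init.d/my-script -- hypothetical LSB-style wrapper for a Python script
SCRIPT=/opt/myapp/my_script.py   # assumed location of the real script
PIDFILE=/var/run/my-script.pid

case "$1" in
  start)
    # launch the script in the background and record its PID
    python "$SCRIPT" &
    echo $! > "$PIDFILE"
    ;;
  stop)
    [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")"
    rm -f "$PIDFILE"
    ;;
  status)
    # LSB convention: exit 0 if running, 3 if stopped
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null && exit 0
    exit 3
    ;;
  *)
    echo "Usage: $0 {start|stop|status}" >&2
    exit 2
    ;;
esac

It can then be created like the other lsb: resources in this archive:

pcs resource create MY_SCRIPT lsb:my-script op monitor interval=30s

Writing a full OCF resource agent gives more control (parameters, richer
monitor semantics), but an LSB wrapper is often the quickest migration
path.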


Please help me get this sorted.


Thanks,
Jaspal Singla
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org