Re: [Pacemaker] Resource is Too Active (on both nodes)

2013-03-25 Thread Andreas Kurz
On 2013-03-22 21:35, Mohica Jasha wrote:
 Hey,
 
 I have two cluster nodes.
 
 I have a service process which is prone to crash and takes a very long
 time to start. 
 Since the service process takes a long time to start I have the service
 process running on both nodes, but only the active node with the virtual
 IP serves the incoming requests.
 
 On both nodes, I have a cron job which periodically checks if the
 service process is up and if not it starts the service.
 
 I want Pacemaker to periodically check whether the service is down on
 the active node and, if so, switch the virtual IP to the second node
 (without starting or stopping my service).
 
 I have the following configuration:
 
 primitive clusterIP ocf:heartbeat:IPaddr2 \
 params ip=10.0.1.247 \
 op monitor interval=10s timeout=20s
 
 primitive serviceMonitoring ocf:serviceMonitoring:serviceMonitoring \
 params op monitor interval=10s timeout=20s
 
 colocation HACluster inf: serviceMonitoring clusterIP
 order serviceMonitoring-after-clusterIP inf: clusterIP serviceMonitoring
 
 My serviceMonitoring resource doesn't do anything other than checking
 the state of the service process. I get the following in the log file:
 
 Mar 05 15:07:59 [1543] ha1 pengine:   notice: unpack_rsc_op: Operation
 monitor found resource serviceMonitoring active on ha2
 Mar 05 15:07:59 [1543] ha1 pengine:   notice: unpack_rsc_op: Operation
 monitor found resource serviceMonitoring active on ha1
 Mar 05 15:07:59 [1543] ha1 pengine:    error: native_create_actions:
 Resource serviceMonitoring (ocf:: serviceMonitoring) is active on 2
 nodes attempting recovery
 Mar 05 15:07:59 [1543] ha1 pengine:  warning: native_create_actions: See
 http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.
 
 So it seems that pacemaker calls the monitor method of the
 serviceMonitoring resource on both nodes.

Yes, Pacemaker probes every resource on all nodes, whether or not it is
meant to run there. Clone your serviceMonitoring resource and set it
into unmanaged mode; that should give you the desired behavior.
Alternatively, simply clone it, let Pacemaker do the complete
management, and go without your cron-check-restart magic.
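
As a rough sketch of the first suggestion in crm shell syntax: the
resource names are taken from the configuration quoted above, while the
clone ID serviceMonitoring-clone, the constraint ID, and the direction
of the colocation are assumptions for illustration and have not been
tested on this cluster.

```
primitive clusterIP ocf:heartbeat:IPaddr2 \
    params ip=10.0.1.247 \
    op monitor interval=10s timeout=20s

primitive serviceMonitoring ocf:serviceMonitoring:serviceMonitoring \
    op monitor interval=10s timeout=20s

# One instance per node, so finding it active on ha1 and ha2 is expected.
# is-managed=false: Pacemaker should not start or stop the instances itself.
clone serviceMonitoring-clone serviceMonitoring \
    meta is-managed=false

# Assumed direction: clusterIP may only run where a clone instance is active.
colocation clusterIP-with-service inf: clusterIP serviceMonitoring-clone
```

For the second suggestion, leaving out the "meta is-managed=false" line
would hand start and stop over to Pacemaker as well, which presumably
requires the agent to implement those actions. The original order
constraint is omitted in this sketch because the IP no longer has to be
up before the monitoring resource.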

Regards,
Andreas

 
 Any idea how I can fix this?
 
 Thanks,
 Mohica
 
 
 
 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org
 


-- 
Need help with Pacemaker?
http://www.hastexo.com/now



[Pacemaker] Resource is Too Active (on both nodes)

2013-03-22 Thread Mohica Jasha
Hey,

I have two cluster nodes.

I have a service process which is prone to crash and takes a very long time
to start.
Since the service process takes a long time to start I have the service
process running on both nodes, but only the active node with the virtual IP
serves the incoming requests.

On both nodes, I have a cron job which periodically checks if the service
process is up and if not it starts the service.

I want Pacemaker to periodically check whether the service is down on the
active node and, if so, switch the virtual IP to the second node (without
starting or stopping my service).

I have the following configuration:

primitive clusterIP ocf:heartbeat:IPaddr2 \
params ip=10.0.1.247 \
op monitor interval=10s timeout=20s

primitive serviceMonitoring ocf:serviceMonitoring:serviceMonitoring \
params op monitor interval=10s timeout=20s

colocation HACluster inf: serviceMonitoring clusterIP
order serviceMonitoring-after-clusterIP inf: clusterIP serviceMonitoring

My serviceMonitoring resource doesn't do anything other than checking the
state of the service process. I get the following in the log file:

Mar 05 15:07:59 [1543] ha1 pengine:   notice: unpack_rsc_op: Operation
monitor found resource serviceMonitoring active on ha2
Mar 05 15:07:59 [1543] ha1 pengine:   notice: unpack_rsc_op: Operation
monitor found resource serviceMonitoring active on ha1
Mar 05 15:07:59 [1543] ha1 pengine:    error: native_create_actions:
Resource serviceMonitoring (ocf:: serviceMonitoring) is active on 2 nodes
attempting recovery
Mar 05 15:07:59 [1543] ha1 pengine:  warning: native_create_actions: See
http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.

So it seems that pacemaker calls the monitor method of the
serviceMonitoring resource on both nodes.

Any idea how I can fix this?

Thanks,
Mohica