Re: [ClusterLabs] Delayed first monitoring

Miloš Kozák Wed, 12 Aug 2015 09:21:07 -0700

I would rather agree with you. However, I don't have the logs at hand to prove it... but that is what I saw in the logs, so I formulated my question as I did :D


On 12.8.2015 at 18:16, emmanuel segura wrote:

Sorry, but from my point of view, the agent first checks whether the
resource is running. For example, you can see that in
/usr/lib/ocf/resource.d/heartbeat/Filesystem


The logic is:

Filesystem::start (the action passed to the agent as a parameter) ->
Filesystem_start (the function called for start by the case statement
that evaluates the parameters) -> Filesystem_status (the function called
by the previous one). If the fs is already mounted, it returns success.

So you need to check whether the resource is already started.
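
A minimal sketch of that call chain, not the actual Filesystem agent code: demo_status, demo_start and the marker file are hypothetical stand-ins for the mount check. The point is only that start asks the status function first and reports success if the resource is already up.

```shell
#!/bin/sh
# OCF return codes (values as defined by the OCF resource agent API).
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical marker file standing in for "the filesystem is mounted".
STATE_FILE="${STATE_FILE:-/tmp/demo_resource.state}"

demo_status() {
    if [ -e "$STATE_FILE" ]; then
        return $OCF_SUCCESS
    fi
    return $OCF_NOT_RUNNING
}

demo_start() {
    # Already running: nothing to do, report success.
    if demo_status; then
        return $OCF_SUCCESS
    fi
    touch "$STATE_FILE" || return $OCF_ERR_GENERIC
    # Confirm the resource is really up before reporting success.
    demo_status
}
```

Calling demo_start twice in a row succeeds both times, which is the behaviour the cluster relies on when it issues a start for something that is already running.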

2015-08-12 16:14 GMT+02:00 Nekrasov, Alexander <alexander.nekra...@emc.com>:

1. Pacemaker will/may call a monitor before starting a resource, in which case 
it expects a NOT_RUNNING response. It's just checking assumptions at that point.

2. A resource::start must only return when resource::monitor is successful. 
Basically the logic of a start() must follow this:

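start() {
    start_daemon        # placeholder: launch the service
    # Do not return until monitor succeeds; the start operation's
    # timeout bounds how long this loop may run.
    while ! monitor; do
        sleep 1
    done
    return $OCF_SUCCESS
}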
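
For point 1, a rough sketch of a monitor that answers such a probe correctly. The name demo_monitor and the pid file path are assumptions for illustration; a real agent would take the path from its parameters and may treat a stale pid file differently.

```shell
#!/bin/sh
# OCF return codes (values as defined by the OCF resource agent API).
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Hypothetical pid file location, used only for this illustration.
PIDFILE="${PIDFILE:-/tmp/demo_daemon.pid}"

demo_monitor() {
    # No pid file: cleanly stopped, which is what a probe before start expects.
    [ -r "$PIDFILE" ] || return $OCF_NOT_RUNNING
    pid=$(cat "$PIDFILE")
    # kill -0 sends no signal; it only tests whether the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return $OCF_SUCCESS
    fi
    # Stale pid file: this sketch simply reports "not running".
    return $OCF_NOT_RUNNING
}
```

Before the daemon has been started this returns OCF_NOT_RUNNING (7) rather than an error, so the probe does not trigger recovery.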

-----Original Message-----
From: Miloš Kozák [mailto:milos.ko...@lejmr.com]
Sent: Wednesday, August 12, 2015 10:03 AM
To: users@clusterlabs.org
Subject: [ClusterLabs] Delayed first monitoring

Hi,

I have set up Corosync+CMAN+Pacemaker on CentOS 6.5 in order to
provide high availability for OpenNebula. However, I am facing a
strange problem which arises from my lack of knowledge.

In the log I can see that when I create a resource based on an init
script, typically:

pcs resource create httpd lsb:httpd

The httpd daemon gets started, but a monitor is initiated at the same
time and the resource is identified as not running. This behaviour makes
sense once we realize that starting the daemon takes some time. In this
particular case, I get error code 2, which means that the process is
running but the environment is not locked. The effect of this is that the
httpd resource gets restarted.
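
For reference when reading such status codes: the LSB specification defines the exit codes of an init script's status action, and there code 2 stands for "program is dead and /var/lock lock file exists". The helper below is only an illustrative lookup (lsb_status_meaning is a made-up name, not part of any init library).

```shell
#!/bin/sh
# Map an LSB "status" exit code to its meaning per the LSB init script spec.
lsb_status_meaning() {
    case "$1" in
        0) echo "running" ;;
        1) echo "dead, but pid file exists" ;;
        2) echo "dead, but lock file exists" ;;
        3) echo "not running" ;;
        4) echo "status unknown" ;;
        *) echo "reserved" ;;
    esac
}
```

For example, `lsb_status_meaning 3` prints "not running", the code reported for a service that has not been started.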

My workaround is an extra sleep in the status function of the init script,
but I don't like this solution at all! Do you have any idea how to tackle
this problem in a proper way? I expected an op attribute which would
specify a delay between service start and the first monitor, but I could
not find it.

Thank you, Milos

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org

