Hi,

On Tue, Sep 04, 2012 at 07:20:23PM -0600, Alan Robertson wrote:
> Hi Dejan,
> 
> If the resource agent is not running correctly it needs to be 
> restarted.  My memory says that OCF_ERR_GENERIC will not cause that 
> behavior.  I believe the spec says you should exit with not running if 
> it is not functioning correctly.  (but I didn't check it, and my memory 
> isn't that clear in this case).

From the OCF standard:

1       generic or unspecified error (current practice)
        The "monitor" operation shall return this for a crashed, hung
        or otherwise non-functional resource.
...
7       program is not running
        Note: This is not the error code to be returned by a
        successful "stop" operation. A successful "stop" operation
        shall return 0.
        The "monitor" action shall return this value only for a
        _cleanly_ stopped resource. If in doubt, it should return 1.

It also sounds OK to me.
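
To make the two cases concrete, here is a minimal sketch of a
monitor helper that always returns an explicit OCF code instead of
passing on grep's exit status. The function name monitor_result and
its three arguments are made up for illustration and are not taken
from the apache agent; the caller is assumed to have already worked
out whether the server is cleanly stopped or running.

```shell
#!/bin/sh
# OCF exit codes relevant to "monitor" (values as quoted from the standard).
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical helper: turn a probe result into an explicit OCF exit code.
#   $1  state as determined by the caller: "stopped" or "running"
#   $2  page body fetched from the test URL
#   $3  extended regular expression the body must match
monitor_result() {
    case "$1" in
        stopped) return $OCF_NOT_RUNNING ;;   # only for a cleanly stopped resource
        running) ;;                           # go on to the content check
        *)       return $OCF_ERR_GENERIC ;;   # in doubt: return 1
    esac

    if printf '%s\n' "$2" | grep -Eiq "$3"; then
        return $OCF_SUCCESS
    fi
    # Running but not serving the expected content. grep's status is
    # deliberately not passed through: besides 1 for "no match" it can
    # be greater than 1 on an error such as a bad pattern.
    return $OCF_ERR_GENERIC
}
```

Something like monitor_result running "$body" "$TESTREGEX" would then
give 0 or 1, and never a value that merely happens to be grep's.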

> I will likely write a monitor-only resource agent for web servers.  What 
> would you think about calling it from the other web resource agents?

There's a bit of code in http-mon.sh, extracted from apache. It
offers some extended testing of web servers. The features are
described in README.webapps.

> This resource agent will not look at any config files, and will require 
> everything explicitly in parameters, and will not know how to start or 
> stop anything.

Somebody wanted to do a ping-like RA, i.e. one setting an
attribute based on the HTTP results. Unfortunately, one
contributor gave up and another wanted to do everything from
scratch, thus duplicating parts of the code. What I'd like to see
is a script handling just CRM attributes. Then it would be easy to
put together a Dummy-like RA that makes use of that script and,
say, http-mon.sh.
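
Roughly, the attribute part could be as small as the sketch below.
Everything in it is an assumption for illustration: the function
name publish_http_state, the ATTR_UPDATER override and the 1/0
values are invented here, and the --name/--update options of
attrd_updater should be checked against the installed pacemaker
version.

```shell
#!/bin/sh
# Hypothetical sketch: publish the result of an HTTP check as a node
# attribute. ATTR_UPDATER can be overridden, e.g. for testing without
# a running cluster.
ATTR_UPDATER=${ATTR_UPDATER:-attrd_updater}

#   $1  attribute name
#   $2  exit status of the HTTP check (0 means the check passed)
publish_http_state() {
    if [ "$2" -eq 0 ]; then
        value=1     # check passed
    else
        value=0     # check failed
    fi
    "$ATTR_UPDATER" --name "$1" --update "$value"
}
```

A Dummy-like RA would then run the HTTP test in its monitor action
and hand the result to publish_http_state, without knowing anything
about how attributes are stored.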

> This would be for my new monitoring project, of course 
> ;-).  But it could then be called by all the HTTP resource agents - or 
> used directly - for example by the Assimilation project.
> 
> This would be a slight but useful bending of OCF resource agent APIs.  
> We could create some new metadata to document it, and also not put start 
> and stop into the actions in the operations section. Or just the latter.
> 
> What do you think?

Right now, there's a bunch of resource agents faking the state
(e.g. ping), that is, pretending to be able to start and stop.
If we could somehow do without that, it would obviously be
beneficial. Not sure if/how pacemaker could deal with such
agents.

Cheers,

Dejan

> 
> On 08/29/2012 05:31 AM, Dejan Muhamedagic wrote:
> > Hi Alan,
> >
> > On Mon, Aug 27, 2012 at 10:51:15AM -0600, Alan Robertson wrote:
> >> Hi,
> >>
> >> I was recently using the Apache resource agent, and discovered a few
> >> problems:
> >>
> >>       The exit code from grep was used directly as an OCF exit code.
> >>               It is NOT an OCF exit code, and should not be directly used
> >> in this way.
> > I guess you mean the greps in monitor_apache_extended and
> > monitor_apache_basic? These lines:
> >
> > 267   $whattorun "$test_url" | grep -Ei "$test_regex" > /dev/null
> > 277   ${ourhttpclient}_func "$STATUSURL" | grep -Ei "$TESTREGEX" > /dev/null
> >
> >>               This caused a "not running" error to become a generic error.
> > These lines are invoked _only_ in case it was previously
> > established that the apache server is running. So, they should
> > return OCF_ERR_GENERIC if the test fails. grep exits with code 1
> > which matches OCF_ERR_GENERIC. But indeed the OCF error code
> > should be returned explicitly.
> >
> >>               Pacemaker reacts very differently to the two kinds of errors.
> >>
> >>         This code occurred in two places.
> >>
> >> The resource agent used OCF_CHECK_LEVEL improperly.
> >>
> >> The specification says that if you receive an OCF_CHECK_LEVEL which you
> >> do not support, you are required to interpret it as the next lower
> >> supported value for OCF_CHECK_LEVEL.
> >>
> >> In effect, there are no invalid OCF_CHECK_LEVEL values.  The Apache
> >> agent declared all values but one to be errors.  This is not the correct
> >> behavior.
> > OK. I somehow missed that while reading the OCF standard.
> >
> > BTW, it'd be great if nginx shared some code with apache. The
> > latter has already been split into three scripts.
> >
> > Cheers,
> >
> > Dejan
> >
> >> -- 
> >>       Alan Robertson <al...@unix.sh> - @OSSAlanR
> >>
> >> "Openness is the foundation and preservative of friendship...  Let me 
> >> claim from you at all times your undisguised opinions." - William 
> >> Wilberforce
> >> _______________________________________________________
> >> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> >> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> >> Home Page: http://linux-ha.org/