Hi,

On Tue, Apr 08, 2008 at 02:45:12PM +0200, Lars Marowsky-Bree wrote:
> On 2008-02-27T20:39:13, Keisuke MORI <[EMAIL PROTECTED]> wrote:
> 
> Hi Keisuke-san,
> 
> thanks for your patch and contribution. I have to apologize in the name
> of everyone for the late feedback.
> 
> I really appreciate the idea of monitoring processes directly, and
> receiving async failure notifications to reduce fail-over times.
> 
> I have just discussed this with Dejan and Andrew, and we think that the
> best path forward, alas necessary before inclusion, is to
> 
> - Make procd independent of Pacemaker. It should talk only to the RAs
>   and the LRM.
> 
> - RAs should "sign in" with it for the processes they want monitored,
>   instead of listing the processes in the procd configuration section
>   (which decouples it further from the CIB). The RAs could write a
>   record to /var/run/heartbeat/procd/<resource-id>, for example.
>
>   The RAs would add/remove the required processes on start/promote or
>   demote/stop. (So procd itself would not need to be master-slave.)
> 
>   I'm afraid that having users manually specify process lists in the CIB
>   really is not workable - the users will not be able to get this
>   right.
> 
> - Instead of respawning procd, there should be a resource agent which
>   starts/stops (and monitors!) procd. You already have one, but why
>   doesn't it go into resources/OCF/?
> 
> - procd should talk to the LRM to insert a "fake" failed resource
>   action, which would then cause the CRM/PE to handle the resource as
>   failed and initiate recovery. (This is not currently possible with the
>   LRM client library; you could exec crm_resource -F, which would mean
>   you no longer have a build-time dependency on the CRM.)

This is going to be implemented in the LRM. Please see (and
subscribe) here:

http://developerbugs.linux-foundation.org/show_bug.cgi?id=1872

Cheers,

Dejan


> - This would have the advantage of decoupling procd from pacemaker as
>   well as heartbeat. It could be included with the LRM/RA package build,
>   and possibly be useful with other cluster managers too.
> 
> I think all that would help simplify the code.
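
To make the "sign in" idea concrete, here is a rough sketch of what
the record handling could look like in C. Everything in it is an
assumption for illustration: the function names (procd_register,
procd_unregister, procd_lookup), the PROCD_PATH_MAX limit, and the
one-PID-per-file record format are hypothetical, since the proposal
above only says that a record is written under a per-resource path.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define PROCD_PATH_MAX 512 /* hypothetical limit for this sketch */

/* Build "<dir>/<rsc_id>"; reject ids that could escape the directory. */
static int procd_record_path(char *buf, size_t len,
                             const char *dir, const char *rsc_id)
{
    int n;

    assert(buf != NULL && dir != NULL && rsc_id != NULL);
    if (rsc_id[0] == '\0' || strchr(rsc_id, '/') != NULL)
        return -1;
    n = snprintf(buf, len, "%s/%s", dir, rsc_id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* RA side, on start/promote: record the PID to be monitored. */
int procd_register(const char *dir, const char *rsc_id, pid_t pid)
{
    char path[PROCD_PATH_MAX];
    FILE *fp;
    int ok;

    if (procd_record_path(path, sizeof(path), dir, rsc_id) != 0)
        return -1;
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    ok = fprintf(fp, "%ld\n", (long)pid) > 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* RA side, on demote/stop: drop the record again. */
int procd_unregister(const char *dir, const char *rsc_id)
{
    char path[PROCD_PATH_MAX];

    if (procd_record_path(path, sizeof(path), dir, rsc_id) != 0)
        return -1;
    return remove(path) == 0 ? 0 : -1;
}

/* procd side: return the registered PID, or -1 if there is none. */
pid_t procd_lookup(const char *dir, const char *rsc_id)
{
    char path[PROCD_PATH_MAX], line[64], *end;
    FILE *fp;
    long pid = -1;

    if (procd_record_path(path, sizeof(path), dir, rsc_id) != 0)
        return -1;
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof(line), fp) != NULL) {
        pid = strtol(line, &end, 10);
        if (end == line || pid <= 0)
            pid = -1; /* empty or malformed record */
    }
    fclose(fp);
    return (pid_t)pid;
}
```

An RA written in shell would do the equivalent with a plain file write
and remove; the point is only that procd learns about processes from
these records rather than from the CIB.
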
> 
> 
> > +#define RSCID_LEN      128 /* ref. include/lrm/lrm_api.h */
> > +#define MAX_PID_LEN    256 /* ref. lrm/lrmd/lrmd.h */
> > +#define MAX_LISTEN_NUM 10 /* ref. lib/clplumbing/ipcsocket.c */
> 
> If you're referencing values from other include files, please include
> those headers directly, so as to avoid diverging definitions.
> 
> 
> Regards,
>     Lars
> 
> -- 
> Teamlead Kernel, SuSE Labs, Research and Development
> SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
> 
> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/