On Tue, Apr 8, 2008 at 6:45 AM, Lars Marowsky-Bree <[EMAIL PROTECTED]> wrote:
> On 2008-02-27T20:39:13, Keisuke MORI <[EMAIL PROTECTED]> wrote:
>
>  Hi Keisuke-san,
>
>  thanks for your patch and contribution. I have to apologize in the name
>  of everyone for the late feedback.
>
>  I really appreciate the idea of monitoring processes directly, and
>  receiving async failure notifications to reduce fail-over times.
>
>  I have just discussed this with Dejan and Andrew, and we think that the
>  best path forward, alas necessary before inclusion, is to
>
>  - Make procd independent of Pacemaker. It should talk only to the RAs
>   and the LRM.
>
>  - RAs should "sign in" with it for the processes they want monitored,
>   instead of listing the processes in the procd configuration section
>   (means it gets decoupled from the CIB further). The RAs could write a
>   record to /var/run/heartbeat/procd/<resource-id>, for example.
>
>   The RAs would add/remove the required processes on start/promote or
>   demote/stop. (So procd itself would not need to be master-slave.)

This part is a little unclear to me. Will each RA have to decide whether
it wants its processes monitored? If so, based on what criteria? And will
it be mandatory for all RAs to "sign in" for that additional monitoring?
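
To make the question concrete, here is a rough sketch of how I picture the
"sign in" step, assuming a record is simply a file named after the resource
id that holds one PID per line. The directory layout follows the
/var/run/heartbeat/procd/<resource-id> example above; the function names
(procd_sign_in, procd_sign_out, procd_record_path) and the record format
are my own assumptions for illustration, not anything procd defines today.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>

#define RECORD_PATH_MAX 4096  /* assumed upper bound for the record path */

/* Hypothetical helper: build "<dir>/<rsc_id>" into buf. Returns 0 on success. */
int procd_record_path(char *buf, size_t len, const char *dir, const char *rsc_id)
{
    int n;

    /* Reject empty ids and ids that could escape the record directory. */
    if (rsc_id[0] == '\0' || strchr(rsc_id, '/') != NULL)
        return -1;

    n = snprintf(buf, len, "%s/%s", dir, rsc_id);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical sign-in: an RA would call this on start/promote. */
int procd_sign_in(const char *dir, const char *rsc_id,
                  const pid_t *pids, size_t npids)
{
    char path[RECORD_PATH_MAX];
    FILE *fp;
    size_t i;

    if (procd_record_path(path, sizeof(path), dir, rsc_id) != 0)
        return -1;

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    /* Assumed record format: one PID per line. */
    for (i = 0; i < npids; i++)
        fprintf(fp, "%ld\n", (long)pids[i]);

    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical sign-out: an RA would call this on demote/stop. */
int procd_sign_out(const char *dir, const char *rsc_id)
{
    char path[RECORD_PATH_MAX];

    if (procd_record_path(path, sizeof(path), dir, rsc_id) != 0)
        return -1;

    return unlink(path);
}
```

If it really is this small, then an RA that never calls the sign-in
function would just keep its normal monitor operation, which would make
the extra monitoring opt-in rather than mandatory. Is that the intent?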

>
>   I'm afraid that having users manually specify process lists in the CIB
>   really is not workable - the users will not be able to get this
>   right.
>
>  - Instead of respawning procd, there should be a resource agent which
>   starts/stops (and monitors!) procd. You already have one, but why
>   doesn't it go into resources/OCF/ ?
>
>  - procd should talk to the LRM to insert a "fake" failed resource
>   action, which would then cause the CRM/PE to handle the resource as
>   failed and initiate recovery. (This is not currently possible with the
>   LRM client library; you could exec crm_resource -F, which would mean
>   you no longer have a build-time dependency on the CRM.)
>
>  - This would have the advantage of decoupling procd from pacemaker as
>   well as heartbeat. It could be included with the LRM/RA package build,
>   and possibly be useful with other cluster managers too.
>
>  I think all that would help simplify the code.
>
>
>  > +#define RSCID_LEN      128 /* ref. include/lrm/lrm_api.h */
>  > +#define MAX_PID_LEN    256 /* ref. lrm/lrmd/lrmd.h */
>  > +#define MAX_LISTEN_NUM 10 /* ref. lib/clplumbing/ipcsocket.c */
>
>  If you're referencing definitions from other header files, please
>  include those headers directly, so as to avoid diverging definitions.
>
>
>  Regards,
>     Lars
>
>  --
>  Teamlead Kernel, SuSE Labs, Research and Development
>  SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
>  "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
>
>  _______________________________________________________
>  Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>  http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>  Home Page: http://linux-ha.org/
>



-- 
Serge Dubrouski.