On Fri, Jul 30, 2010 at 11:42 AM, Keisuke MORI
<keisuke.mori...@gmail.com> wrote:
> 2010/7/27 Andrew Beekhof <and...@beekhof.net>:
>> On Tue, Jul 27, 2010 at 8:44 AM, Keisuke MORI <keisuke.mori...@gmail.com> 
>> wrote:
>>> For heartbeat, I personally like "pacemaker on" in ha.cf :)
>>
>> One thing that's coming in 1.1.3 is an mcp (master control process) and
>> associated init script for pacemaker.
>> This means that Pacemaker is started/stopped independently of the
>> messaging layer.
>>
>> Currently this is only written for corosync[1], but I've been toying
>> with the idea of extending it to Heartbeat.
>> In which case, if you're already changing the option, you might want
>> to make it: legacy on/off
>> Where "off" would be the equivalent of starting with -M (no resource
>> management) but wouldn't spawn any daemons.
>>
>> Thoughts?
>
> I have several concerns with that change:
>
> 1) Is it possible to recover, or to trigger a fail-over correctly,
> when any of the Pacemaker/Heartbeat processes fails?
>   (In particular, on failure of the new mcp process of pacemaker
> and on failure of the current heartbeat MCP process)

If the MCP dies, so will the crmd and cib (and by extension,
everything else except the PE and LRMd).
These types of failures are well tested.

Failure of heartbeat also results in the same types of secondary
failures and recovery as we see now.

>
> 2) Would daemons started with the respawn directive, such as hbagent
> (the SNMP daemon) or pingd, still work compatibly?

Pingd will no longer exist in 1.1, if I've not removed it already.
It is completely replaced by ocf:pacemaker:ping.
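
For anyone moving off a respawned pingd, a minimal sketch of the
replacement in crm shell syntax might look like the fragment below.
The resource names (p_ping, cl_ping), the host address and the timing
values are illustrative placeholders, not taken from any real
configuration; adjust them to the cluster at hand.

```
# Hypothetical names and values, for illustration only.
primitive p_ping ocf:pacemaker:ping \
    params host_list="192.168.1.1" multiplier="100" dampen="5s" \
    op monitor interval="10s" timeout="60s"
clone cl_ping p_ping
```

Cloning the primitive runs the connectivity check on every node, which
is roughly what a pingd respawned on each node used to provide.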

hbagent might have to be a bit smarter about what to do when the cib
isn't around, but otherwise it shouldn't be a problem.

>
> 3) After all, what would be the benefit for end users with the change?

For corosync-based clusters it's a clear win: we get a much more
reliable startup/shutdown sequence.
It's also far more obvious what is happening at each stage, and one
can stop all resources on a node without taking the node offline.

If we changed it for heartbeat, it would be mostly for consistency.

>   I feel like it only adds complexity to operations and
> diagnostics for end users.
>
> I guess that I would only use "legacy on" on the heartbeat stack...

Correct.
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/