Hello Martin,
it is pure luck that I am so bored that I am reading this list; next
time, please CC me. :-)

> I have read several postings in the mail archive about the
> external/ipmi configuration, but there are still some questions that
> bother me.  The last posting from Thomas: did this cib-configuration
> work with your 2-node cluster?  I also have to configure 2 nodes and
> would like to use the ipmi-plugin, but I am unsure whether I
> understand what the plugin really does.

I have the following configuration on two systems and I verified that
it works as it should. Someone on this list told me that I can drop
the location constraints, but I decided to keep them until I have
verified that myself.

<configuration>
  <crm_config>
    <cluster_property_set id="cib-bootstrap-options">
      <attributes>
        <nvpair name="stonith-enabled" value="true" id="stonith-enabled"/>
        <nvpair name="stonith-action" value="reboot" id="stonith-action"/>
      </attributes>
    </cluster_property_set>
  </crm_config>

  <resources>
    <primitive id="apache-01-fencing" class="stonith" type="external/ipmi" provider="heartbeat">
      <operations>
        <op id="apache-01-fencing-monitor" name="monitor" interval="60s" timeout="20s" prereq="nothing"/>
        <op id="apache-01-fencing-start" name="start" timeout="20s" prereq="nothing"/>
      </operations>

      <instance_attributes id="ia-apache-01-fencing">
        <attributes>
          <nvpair id="apache-01-fencing-hostname" name="hostname" value="apache-01"/>
          <nvpair id="apache-01-fencing-ipaddr" name="ipaddr" value="172.18.0.101"/>
          <nvpair id="apache-01-fencing-userid" name="userid" value="Administrator"/>
          <nvpair id="apache-01-fencing-passwd" name="passwd" value="whatever"/>
        </attributes>
      </instance_attributes>
    </primitive>

    <primitive id="apache-02-fencing" class="stonith" type="external/ipmi" provider="heartbeat">
      <operations>
        <op id="apache-02-fencing-monitor" name="monitor" interval="60s" timeout="20s" prereq="nothing"/>
        <op id="apache-02-fencing-start" name="start" timeout="20s" prereq="nothing"/>
      </operations>

      <instance_attributes id="ia-apache-02-fencing">
        <attributes>
          <nvpair id="apache-02-fencing-hostname" name="hostname" value="apache-02"/>
          <nvpair id="apache-02-fencing-ipaddr" name="ipaddr" value="172.18.0.102"/>
          <nvpair id="apache-02-fencing-userid" name="userid" value="Administrator"/>
          <nvpair id="apache-02-fencing-passwd" name="passwd" value="whatever"/>
        </attributes>
      </instance_attributes>
    </primitive>
  </resources>

  <constraints>
    <rsc_location id="apache-01-fencing-placement" rsc="apache-01-fencing">
      <rule id="apache-01-fencing-placement-rule-1" score="-INFINITY">
        <expression id="apache-01-fencing-placement-exp-02" value="apache-02" attribute="#uname" operation="ne"/>
      </rule>
    </rsc_location>

    <rsc_location id="apache-02-fencing-placement" rsc="apache-02-fencing">
      <rule id="apache-02-fencing-placement-rule-1" score="-INFINITY">
        <expression id="apache-02-fencing-placement-exp-02" value="apache-01" attribute="#uname" operation="ne"/>
      </rule>
    </rsc_location>
  </constraints>
</configuration>
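
In case it is useful: I load such snippets into the running cluster
with cibadmin, roughly like this (the file names are only examples and
you have to split the snippet into the matching CIB sections first):

    cibadmin -C -o resources   -x fencing-resources.xml
    cibadmin -C -o constraints -x fencing-constraints.xml
    crm_verify -L -V    # sanity check of the live CIB afterwards

The two cluster properties can also be set with crm_attribute, e.g.
crm_attribute -t crm_config -n stonith-enabled -v true.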

I killed heartbeat with -9 to simulate a node failure.
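On the node that is supposed to get fenced that was simply something
like

    killall -9 heartbeat

and shortly afterwards the other node shot it via IPMI.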


> To configure the plugin, I will create a resource for every node. This
> means, two additional resources in my cib.xml because I have two
> cluster-nodes.

Correct.

> The attributes (nvpair) define variables for the ipmi-script, e.g.
> hostname...  But what do the constraints tell me? If "#uname" is not
> equal to "value", then the score is -INFINITY, i.e. the resource will
> never be started on that node?

You pin "apache-01-fencing" on apache-02 and "apache-02-fencing" on
apache-01, so that the resource that can stonith apache-01 runs on
apache-02 and vice versa. Someone stated that heartbeat is able to
commit suicide (stonith itself), but that isn't true, at least not via
stonith and not in version 2.1.3. The location constraints seem to be
unnecessary, because if a fencing resource is running on the wrong node
and that node misbehaves, the resource is restarted on the remaining
node and then shoots the misbehaving one. However, they don't hurt
either, so I leave the configuration exactly as above because it has
proven to work reliably in real life.
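
If you want to check that the pinning does what you expect,
crm_resource can tell you where the fencing resources ended up (if I
remember the options correctly):

    crm_resource -W -r apache-01-fencing    # should report apache-02
    crm_resource -W -r apache-02-fencing    # should report apache-01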

> In summary, how does the fencing work? The heartbeat process gets
> killed on one node, so all resources of that node are failed over to
> the other node.  So far so good, but what role does the new resource
> play in the failover? Is this resource also failed over to the other
> node, and does it therefore execute the ipmi-script with a
> "reset"-option against its "origin"-node?

Nope. The location constraint makes it impossible to fail over the
stonith resource, so it won't be started again until the dead node
comes back to life. All the other resources do migrate over, though.
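
While testing it helps to keep crm_mon running on the surviving node,
then you can watch the stonith operation and the takeover happen:

    crm_mon -i 2    # refresh the cluster status every 2 seconds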

> I still do not know exactly how this stonith-plugin works, so I have
> some problems configuring heartbeat.  I hope someone can help me with
> these questions, because I want to understand how STONITH works
> instead of just copying and pasting some code-snippets.

If a node misbehaves, e.g. it stops talking to the others without
saying "bye", it gets shot in the head to make sure it is really dead.
After that, the resources that were running on it are taken over by
the other node.
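
As far as I know the external/ipmi plugin is little more than a wrapper
around ipmitool, so you can (and should) test the BMCs by hand before
relying on the cluster. With the parameters from the nvpairs above that
is roughly:

    ipmitool -I lan -H 172.18.0.101 -U Administrator -P whatever chassis power status
    ipmitool -I lan -H 172.18.0.101 -U Administrator -P whatever chassis power reset

The second command really resets the box, so don't run it on a machine
that is doing something important at that moment.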

        Thomas