Re: [Pacemaker] RFC: What part of the XML configuration do you hate the most?

Andrew Beekhof Wed, 24 Sep 2008 03:28:13 -0700

On a technical level, the use of inhibit_notify means that the cluster won't even act on the standby action until something else happens to invoke the PE again.


There is no need to even have a standby action... one can simply do:


+               } else if(on_fail == action_fail_standby) {
+                       node->details->standby = TRUE;
+

in process_rsc_state() and it would take effect immediately - making most of the patch redundant.
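To illustrate why setting the flag during unpacking is enough, here is a minimal, self-contained sketch. It is not Pacemaker code: struct node_details, apply_on_fail() and node_can_run_resources() are names made up for this example, and only the four conditions checked (online, shutdown, unclean, standby) are taken from the pengine/utils.c hunk in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the per-node state the policy engine keeps.
 * Hypothetical type, not the real node_t. */
struct node_details {
    const char *uname;
    bool online;
    bool unclean;
    bool shutdown;
    bool standby;
};

/* Hypothetical subset of on_fail values relevant to this thread. */
enum on_fail_policy { FAIL_IGNORE, FAIL_STOP, FAIL_STANDBY, FAIL_FENCE };

/* A node may host resources only if none of the blocking flags is set. */
bool node_can_run_resources(const struct node_details *n)
{
    assert(n != NULL);
    return n->online && !n->shutdown && !n->unclean && !n->standby;
}

/* Apply the on_fail policy of a failed operation to the node it ran on.
 * Setting standby here means the very same placement pass already
 * treats the node as ineligible; no separate graph action is needed. */
void apply_on_fail(struct node_details *n, enum on_fail_policy policy)
{
    assert(n != NULL);
    if (policy == FAIL_FENCE) {
        n->unclean = true;
    } else if (policy == FAIL_STANDBY) {
        n->standby = true;
    }
}
```

In this sketch, a node that is online and clean stops being eligible as soon as apply_on_fail() is called with FAIL_STANDBY, which is the "takes effect immediately" behaviour described above.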

I still think it's strange that you'd want to migrate away all resources because an unrelated one failed... but it's your cluster.

I'll apply a modified version of this patch today.


On Sep 24, 2008, at 10:34 AM, Satomi TANIGUCHI wrote:

Hello,

I'm now posting a patch that implements on_fail="standby".
This patch is for pacemaker-dev(5383f371494e).

Its purpose is to move all resources away from a node
when a resource fails on it.
This setting is for the start or monitor operation, not for the stop op.
And as far as I can confirm, the loop that Andrew described doesn't appear.

Your comments and suggestions are really appreciated.


Best Regards,
Satomi TANIGUCHI




Satomi Taniguchi wrote:

Hi Andrew,
Andrew Beekhof wrote:
>
(snip)
>

> no, i'm indicating that you've underestimated the scope of the problem

>
(snip)

Bugzilla #1601 is caused by moving healthy resources in the STONITH ordering, isn't it? I changed nothing about the STONITH action when I implemented on_fail="standby".

On the failure of a stop operation, or when Split-Brain occurs,
I completely agree that on_fail should be "fence".
But I am considering the failure of start or monitor operations,
and on_fail="standby" assumes that it is used only for these operations.

Its purpose is not to move healthy resources before doing STONITH,
but to move all resources away from the node on which a resource has failed.
And in any operation, Bugzilla #1601 doesn't occur because I changed nothing about STONITH.

STONITH does not require any resources to be stopped.
The following is why I attach importance to start and monitor operations.
What I regard as serious are:
 - 1)On a resource's failure, only the failed resource
     and resources which are in the same group move from
     the failed node.
     -> At present, to move all resources (even if they are not
        in the group or have no constraints) away from
        the failed node automatically, the on_fail setting of
        not only stop but also start and monitor has to be set to
        "fence", and the failed node has to be killed by STONITH.
 - 2)(In connection with 1) When resources are moved away because of
     a start or monitor failure, they should be shut down normally.
     -> It sounds perfectly natural, but it is impossible
        if you follow 1).
     -> Of course, I know that I have to kill the failed node
        immediately if a stop operation fails or Split-Brain occurs.
 - 3)Rebooting the failed node may destroy the evidence of
     the real cause of a failure
     (which roughly means administrators can't analyse the failure).
     -> This is as Keisuke-san wrote before.
        It is a really serious matter in Enterprise services.
To solve the matters above, I implemented on_fail="standby".
If you have any other ideas to solve them, please let me know.
Just for reference, there is an example in attached files:

a resource group named "grpPostgreSQLDB", which consists of IPaddr("prmIpPostgreSQLDB") and pgsql("prmApPostgreSQLDB"), is working on node2.

(See: crm_mon_before.log)
I modified pgsql's stop function to always return $OCF_ERR_GENERIC.

When the IPaddr resource failed, and its monitor's on_fail is "standby", pgsql tried to stop but it failed.

(See: pe-warn-0.node2.gif)

Then STONITH was executed according to the setting of pgsql's stop operation, on_fail="fence".

(See: pe-warn-1.node2.gif and pe-warn-0.node1.gif)

STONITH killed node2 pitilessly, and both resources of the group moved to node1 peacefully.

(See: crm_mon_after.log)
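
The escalation in this example (monitor failure puts the node in standby, and a failed stop during the evacuation leads to STONITH) can be sketched roughly as follows. This is only an illustration under the assumption stated in this thread that every stop operation has on_fail="fence"; struct rsc, evacuate_node() and the outcome names are hypothetical and do not exist in Pacemaker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Final state of the node after an on_fail="standby" evacuation. */
enum outcome { OUTCOME_STANDBY, OUTCOME_FENCED };

/* Hypothetical view of one resource running on the failed node. */
struct rsc {
    const char *id;
    bool stop_succeeds;  /* result of the agent's stop in this run */
};

/* Stop every resource on the node. Assumption: each stop operation is
 * configured with on_fail="fence", so the first failed stop escalates
 * to STONITH; otherwise the node simply ends up in standby. */
enum outcome evacuate_node(const struct rsc *rscs, size_t count)
{
    assert(rscs != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!rscs[i].stop_succeeds) {
            return OUTCOME_FENCED;
        }
    }
    return OUTCOME_STANDBY;
}
```

With the modified pgsql agent from the attached example (stop always fails), evacuate_node() reports OUTCOME_FENCED; with a normal agent the node would just go to standby and keep its state for later analysis.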
Best Regards,
Satomi Taniguchi
Andrew Beekhof wrote:


On Aug 4, 2008, at 8:11 AM, Satomi Taniguchi wrote:

Hi Andrew,

Thank you for your opinions!
But I'm afraid that you've misunderstood my intentions...

no, i'm indicating that you've underestimated the scope of the problem



Andrew Beekhof wrote:
(snip)

Two problems...
The first is that standby happens after the fencing event, so it's not really doing anything to migrate the healthy resources.


In the graph, the object "stonith-1 stop 0 rh5node1" just means
"a plugin named stonith-1 on rh5node1 stops",
not "fencing event occurs".

For example, Node1 has two resource groups.
When a resource in one group fails,
all resources in both groups are stopped completely,
and the stonith plugin on Node1 is stopped.
After this, both resource groups work on Node2.
I attached a graph, cib.xml
and crm_mon's logs (before and after a resource broke down).
Please see them.

Stop RscZ -(depends on)-> Stop RscY -(depends on)-> StonithNodeX -(depends on)-> Stop RscZ -(depends on)-> ...

I just want to stop all resources without STONITH when monitor fails;
I don't want to change any actions when stop fails.

The on_fail="standby" setting is for the start or monitor operation, and it is on condition that the stop operation's on_fail is set to "fence".

Then, STONITH is not executed when start or monitor fails,
but it is executed when stop fails.

So, if RscY's monitor operation fails,
its stop operation doesn't depend on "Stonith NodeX".
And if RscY fails to stop,
NodeX is turned off by STONITH, and the loop above does not occur.


Best Regards,
Satomi Taniguchi



_______________________________________________
Pacemaker mailing list
Pacemaker@clusterlabs.org
http://list.clusterlabs.org/mailman/listinfo/pacemaker




diff -urN pacemaker-dev.orig/crmd/te_actions.c pacemaker-dev/crmd/te_actions.c
--- pacemaker-dev.orig/crmd/te_actions.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/crmd/te_actions.c   2008-09-24 12:26:54.000000000 +0900

@@ -161,6 +161,54 @@
        return TRUE;
}

+static gboolean
+te_standby_node(crm_graph_t *graph, crm_action_t *action)
+{
+       const char *id = NULL;
+       const char *uuid = NULL;
+       const char *target = NULL;
+
+       char *attr_id = NULL;
+       int str_length = 2;
+       const char *attr_name = "standby";
+
+       id = ID(action->xml);
+       target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
+       uuid = crm_element_value(action->xml, XML_LRM_ATTR_TARGET_UUID);
+
+       CRM_CHECK(id != NULL,
+                 crm_log_xml_warn(action->xml, "BadAction");
+                 return FALSE);
+       CRM_CHECK(uuid != NULL,
+                 crm_log_xml_warn(action->xml, "BadAction");
+                 return FALSE);
+       CRM_CHECK(target != NULL,
+                 crm_log_xml_warn(action->xml, "BadAction");
+                 return FALSE);
+
+       te_log_action(LOG_INFO,
+                     "Executing standby operation (%s) on %s", id, target);
+
+       str_length += strlen(attr_name);
+       str_length += strlen(uuid);
+
+       crm_malloc0(attr_id, str_length);
+       sprintf(attr_id, "%s-%s", attr_name, uuid);
+
+       if (cib_ok > update_attr(fsa_cib_conn, cib_inhibit_notify,
+               XML_CIB_TAG_NODES, uuid, NULL, attr_id, attr_name, "on", FALSE)) {
+               crm_err("Cannot standby %s: update_attr() call failed.", target);
+       }
+       crm_free(attr_id);
+
+       crm_info("Skipping wait for %d", action->id);
+       action->confirmed = TRUE;
+       update_graph(graph, action);
+       trigger_graph();
+
+       return TRUE;
+}
+
static int get_target_rc(crm_action_t *action)
{
        const char *target_rc_s = g_hash_table_lookup(
@@ -471,7 +519,8 @@
        te_pseudo_action,
        te_rsc_command,
        te_crm_command,
-       te_fence_node
+       te_fence_node,
+       te_standby_node
};

void

diff -urN pacemaker-dev.orig/include/crm/crm.h pacemaker-dev/include/crm/crm.h
--- pacemaker-dev.orig/include/crm/crm.h      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/include/crm/crm.h   2008-09-24 12:26:54.000000000 +0900

@@ -143,6 +143,7 @@
#define CRM_OP_SHUTDOWN_REQ     "req_shutdown"
#define CRM_OP_SHUTDOWN         "do_shutdown"
#define CRM_OP_FENCE            "stonith"
+#define CRM_OP_STANDBY         "standby"
#define CRM_OP_EVENTCC          "event_cc"
#define CRM_OP_TEABORT          "te_abort"
#define CRM_OP_TEABORTED        "te_abort_confirmed" /* we asked */

diff -urN pacemaker-dev.orig/include/crm/pengine/common.h pacemaker-dev/include/crm/pengine/common.h
--- pacemaker-dev.orig/include/crm/pengine/common.h      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/include/crm/pengine/common.h   2008-09-24 12:26:54.000000000 +0900

@@ -33,6 +33,7 @@
        action_fail_migrate,    /* recover by moving it somewhere else */
        action_fail_block,
        action_fail_stop,
+       action_fail_standby,
        action_fail_fence
};

@@ -51,6 +52,7 @@
        action_demote,
        action_demoted,
        shutdown_crm,
+       standby_node,
        stonith_node
};

diff -urN pacemaker-dev.orig/include/crm/pengine/status.h pacemaker-dev/include/crm/pengine/status.h
--- pacemaker-dev.orig/include/crm/pengine/status.h      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/include/crm/pengine/status.h   2008-09-24 12:26:54.000000000 +0900

@@ -107,6 +107,7 @@
                gboolean standby;
                gboolean pending;
                gboolean unclean;
+               gboolean action_standby;
                gboolean shutdown;
                gboolean expected_up;
                gboolean is_dc;

diff -urN pacemaker-dev.orig/include/crm/transition.h pacemaker-dev/include/crm/transition.h
--- pacemaker-dev.orig/include/crm/transition.h      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/include/crm/transition.h   2008-09-24 12:26:54.000000000 +0900

@@ -113,6 +113,7 @@
                gboolean (*rsc)(crm_graph_t *graph, crm_action_t *action);
                gboolean (*crmd)(crm_graph_t *graph, crm_action_t *action);
                gboolean (*stonith)(crm_graph_t *graph, crm_action_t *action);
+               gboolean (*standby)(crm_graph_t *graph, crm_action_t *action);
} crm_graph_functions_t;

enum transition_status {

diff -urN pacemaker-dev.orig/lib/pengine/common.c pacemaker-dev/lib/pengine/common.c
--- pacemaker-dev.orig/lib/pengine/common.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/pengine/common.c   2008-09-24 12:26:54.000000000 +0900

@@ -154,6 +154,9 @@
                case action_fail_fence:
                        result = "fence";
                        break;
+               case action_fail_standby:
+                       result = "standby";
+                       break;
        }
        return result;
}
@@ -175,6 +178,8 @@
                return shutdown_crm;
        } else if(safe_str_eq(task, CRM_OP_FENCE)) {
                return stonith_node;
+       } else if(safe_str_eq(task, CRM_OP_STANDBY)) {
+               return standby_node;
        } else if(safe_str_eq(task, CRMD_ACTION_STATUS)) {
                return monitor_rsc;
        } else if(safe_str_eq(task, CRMD_ACTION_NOTIFY)) {
@@ -242,6 +247,9 @@
                case stonith_node:
                        result = CRM_OP_FENCE;
                        break;
+               case standby_node:
+                       result = CRM_OP_STANDBY;
+                       break;
                case monitor_rsc:
                        result = CRMD_ACTION_STATUS;
                        break;

diff -urN pacemaker-dev.orig/lib/pengine/unpack.c pacemaker-dev/lib/pengine/unpack.c
--- pacemaker-dev.orig/lib/pengine/unpack.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/pengine/unpack.c   2008-09-24 12:26:54.000000000 +0900

@@ -244,6 +244,7 @@
                         */
                        new_node->details->unclean = TRUE;
                }
+               new_node->details->action_standby = FALSE;
                
                if(type == NULL
                   || safe_str_eq(type, "member")
@@ -811,6 +812,10 @@
                        node->details->unclean = TRUE;
                        stop_action(rsc, node, FALSE);
                                
+               } else if(on_fail == action_fail_standby) {
+                       node->details->action_standby = TRUE;
+                       stop_action(rsc, node, FALSE);
+
                } else if(on_fail == action_fail_block) {
                        /* is_managed == FALSE will prevent any
                         * actions being sent for the resource

diff -urN pacemaker-dev.orig/lib/pengine/utils.c pacemaker-dev/lib/pengine/utils.c
--- pacemaker-dev.orig/lib/pengine/utils.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/pengine/utils.c   2008-09-24 12:26:54.000000000 +0900

@@ -707,6 +707,10 @@
                    value = "stop resource";
                }
                
+       } else if(safe_str_eq(value, "standby")) {
+               action->on_fail = action_fail_standby;
+               value = "node fencing (standby)";
+
        } else if(safe_str_eq(value, "ignore")
                || safe_str_eq(value, "nothing")) {
                action->on_fail = action_fail_ignore;

diff -urN pacemaker-dev.orig/lib/transition/graph.c pacemaker-dev/lib/transition/graph.c
--- pacemaker-dev.orig/lib/transition/graph.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/transition/graph.c   2008-09-24 12:26:54.000000000 +0900

@@ -188,6 +188,11 @@
                        crm_debug_2("Executing STONITH-event: %d",
                                      action->id);
                        return graph_fns->stonith(graph, action);
+
+               } else if(safe_str_eq(task, CRM_OP_STANDBY)) {
+                       crm_debug_2("Executing STANDBY-event: %d",
+                                     action->id);
+                       return graph_fns->standby(graph, action);
                }
                
                crm_debug_2("Executing crm-event: %d", action->id);

diff -urN pacemaker-dev.orig/lib/transition/utils.c pacemaker-dev/lib/transition/utils.c
--- pacemaker-dev.orig/lib/transition/utils.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/transition/utils.c   2008-09-24 12:26:54.000000000 +0900

@@ -41,6 +41,7 @@
        pseudo_action_dummy,
        pseudo_action_dummy,
        pseudo_action_dummy,
+       pseudo_action_dummy,
        pseudo_action_dummy
};

@@ -61,6 +62,7 @@
        CRM_ASSERT(graph_fns->crmd != NULL);
        CRM_ASSERT(graph_fns->pseudo != NULL);
        CRM_ASSERT(graph_fns->stonith != NULL);
+       CRM_ASSERT(graph_fns->standby != NULL);
}

const char *

diff -urN pacemaker-dev.orig/pengine/allocate.c pacemaker-dev/pengine/allocate.c
--- pacemaker-dev.orig/pengine/allocate.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/pengine/allocate.c   2008-09-24 12:26:54.000000000 +0900

@@ -777,6 +777,14 @@
                                last_stonith = stonith_op;                      
                        }

+               } else if(node->details->online && node->details->action_standby) {
+                       action_t *standby_op = NULL;
+
+                       standby_op = custom_action(
+                               NULL, crm_strdup(CRM_OP_STANDBY),
+                               CRM_OP_STANDBY, node, FALSE, TRUE, data_set);
+                       standby_constraints(node, standby_op, data_set);
+
                } else if(node->details->online && node->details->shutdown) {   
                    
                        action_t *down_op = NULL;       
                        crm_info("Scheduling Node %s for shutdown",

diff -urN pacemaker-dev.orig/pengine/graph.c pacemaker-dev/pengine/graph.c
--- pacemaker-dev.orig/pengine/graph.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/pengine/graph.c       2008-09-24 12:26:54.000000000 +0900
@@ -347,6 +347,29 @@
        return TRUE;
}

+gboolean
+standby_constraints(
+       node_t *node, action_t *standby_op, pe_working_set_t *data_set)
+{
+       /* add the stop to the before lists so it counts as a pre-req
+        * for the standby
+        */
+       slist_iter(
+               rsc, resource_t, node->details->running_rsc, lpc,
+
+               if(is_not_set(rsc->flags, pe_rsc_managed)) {
+                       continue;
+               }
+
+               custom_action_order(
+                       rsc, stop_key(rsc), NULL,
+                       NULL, crm_strdup(CRM_OP_STANDBY), standby_op,
+                       pe_order_implies_left, data_set);
+       );
+
+       return TRUE;
+}
+
static void dup_attr(gpointer key, gpointer value, gpointer user_data)
{
        g_hash_table_replace(user_data, crm_strdup(key), crm_strdup(value));
@@ -369,6 +392,9 @@
                action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
/*              needs_node_info = FALSE; */
                
+       } else if(safe_str_eq(action->task, CRM_OP_STANDBY)) {
+               action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
+
        } else if(safe_str_eq(action->task, CRM_OP_SHUTDOWN)) {
                action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);

diff -urN pacemaker-dev.orig/pengine/group.c pacemaker-dev/pengine/group.c
--- pacemaker-dev.orig/pengine/group.c      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/pengine/group.c       2008-09-24 12:26:54.000000000 +0900
@@ -435,6 +435,7 @@
                case action_notified:
                case shutdown_crm:
                case stonith_node:
+               case standby_node:
                    break;
                case stop_rsc:
                case stopped_rsc:

diff -urN pacemaker-dev.orig/pengine/pengine.h pacemaker-dev/pengine/pengine.h
--- pacemaker-dev.orig/pengine/pengine.h      2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/pengine/pengine.h   2008-09-24 12:26:54.000000000 +0900

@@ -150,6 +150,9 @@
extern gboolean stonith_constraints(
        node_t *node, action_t *stonith_op, pe_working_set_t *data_set);

+extern gboolean standby_constraints(
+       node_t *node, action_t *standby_op, pe_working_set_t *data_set);
+
extern int custom_action_order(
        resource_t *lh_rsc, char *lh_task, action_t *lh_action,
        resource_t *rh_rsc, char *rh_task, action_t *rh_action,

diff -urN pacemaker-dev.orig/pengine/utils.c pacemaker-dev/pengine/utils.c
--- pacemaker-dev.orig/pengine/utils.c      2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/pengine/utils.c       2008-09-24 12:26:54.000000000 +0900
@@ -180,10 +180,13 @@
        if(node->details->online == FALSE
           || node->details->shutdown
           || node->details->unclean
-          || node->details->standby) {
-               crm_debug_2("%s: online=%d, unclean=%d, standby=%d",
+          || node->details->standby
+          || node->details->action_standby) {
+               crm_debug_2("%s: online=%d, unclean=%d, standby=%d" \
+                           ", action_standby=%d",
                            node->details->uname, node->details->online,
-                           node->details->unclean, node->details->standby);
+                           node->details->unclean, node->details->standby,
+                           node->details->action_standby);
                return FALSE;
        }
        return TRUE;
@@ -337,6 +340,7 @@
                case monitor_rsc:
                case shutdown_crm:
                case stonith_node:
+               case standby_node:
                        task = no_action;
                        break;
                default:
@@ -429,6 +433,7 @@
        
        switch(text2task(action->task)) {
                case stonith_node:
+               case standby_node:
                case shutdown_crm:
                        do_crm_log(log_level,
                                      "%s%s%sAction %d: %s%s%s%s%s%s",

diff -urN pacemaker-dev.orig/xml/crm-1.0.dtd pacemaker-dev/xml/crm-1.0.dtd
--- pacemaker-dev.orig/xml/crm-1.0.dtd      2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/xml/crm-1.0.dtd       2008-09-24 12:26:54.000000000 +0900
@@ -266,7 +266,7 @@
          disabled      (true|yes|1|false|no|0)        'false'
          role          (Master|Slave|Started|Stopped) 'Started'
          prereq        (nothing|quorum|fencing)       #IMPLIED

<!--
Use this to emulate v1 type Heartbeat groups.

Defining a resource group is a quick way to make sure that the resources:

diff -urN pacemaker-dev.orig/xml/crm-transitional.dtd pacemaker-dev/xml/crm-transitional.dtd
--- pacemaker-dev.orig/xml/crm-transitional.dtd      2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/xml/crm-transitional.dtd   2008-09-24 12:26:54.000000000 +0900

@@ -272,7 +272,7 @@
          disabled      (true|yes|1|false|no|0)        'false'
          role          (Master|Slave|Started|Stopped) 'Started'
          prereq        (nothing|quorum|fencing)       #IMPLIED

<!--
Use this to emulate v1 type Heartbeat groups.

Defining a resource group is a quick way to make sure that the resources:

diff -urN pacemaker-dev.orig/xml/crm.dtd pacemaker-dev/xml/crm.dtd
--- pacemaker-dev.orig/xml/crm.dtd      2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/xml/crm.dtd   2008-09-24 12:26:54.000000000 +0900
@@ -266,7 +266,7 @@
          disabled      (true|yes|1|false|no|0)        'false'
          role          (Master|Slave|Started|Stopped) 'Started'
          prereq        (nothing|quorum|fencing)       #IMPLIED

<!--
Use this to emulate v1 type Heartbeat groups.

Defining a resource group is a quick way to make sure that the resources:

diff -urN pacemaker-dev.orig/xml/resources.rng.in pacemaker-dev/xml/resources.rng.in
--- pacemaker-dev.orig/xml/resources.rng.in      2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/xml/resources.rng.in   2008-09-24 12:26:54.000000000 +0900

@@ -160,6 +160,7 @@
                  <value>block</value>
                  <value>stop</value>
                  <value>restart</value>
+                 <value>standby</value>
                  <value>fence</value>
                </choice>
              </attribute>
