Hello,
I'm posting a patch that implements on_fail="standby".
The patch is against pacemaker-dev (5383f371494e).
Its purpose is to move all resources away from a node
when a resource fails on that node.
The setting is intended for start and monitor operations, not for stop.
As far as I have confirmed, the loop that Andrew described does not occur.
Your comments and suggestions would be much appreciated.
Best Regards,
Satomi TANIGUCHI
Satomi Taniguchi wrote:
Hi Andrew,
Andrew Beekhof wrote:
>
(snip)
>
> no, i'm indicating that you've underestimated the scope of the
problem
>
(snip)
Bugzilla #1601 is caused by moving healthy resources in the STONITH
ordering, isn't it?
I changed nothing about the STONITH action when I implemented
on_fail="standby".
When a stop operation fails or Split-Brain occurs,
I completely agree that on_fail should be "fence".
But what I am concerned with is the failure of start or monitor
operations, and on_fail="standby" assumes that it is used only for
these operations.
Its purpose is not to move healthy resources before doing STONITH,
but to move all resources away from the node on which a resource has
failed.
And for any operation, Bugzilla #1601 does not occur because I changed
nothing about STONITH.
STONITH does not require stopping any resources.
The following is why I place so much importance on start and monitor
operations.
My main concerns are:
- 1) When a resource fails, only the failed resource
and the resources in the same group move away from
the failed node.
-> At present, to move all resources (even those that are not
in the group and have no constraints) away from
the failed node automatically, on_fail has to be set to
"fence" not only for stop but also for start and monitor,
and the failed node has to be killed by STONITH.
- 2) (Related to 1) When resources are moved away because a start
or monitor operation failed, they should be shut down normally.
-> This sounds perfectly ordinary, but it is impossible
if you follow 1).
-> Of course, I know that the failed node has to be killed
immediately if a stop operation fails or Split-Brain
occurs.
- 3) Rebooting the failed node may destroy the evidence of
the real cause of a failure
(which practically means that administrators cannot analyse the failure).
-> This is as Keisuke-san wrote before.
It is a really serious matter for enterprise services.
To solve the matters above, I implemented on_fail="standby".
If you have any other ideas to solve them, please let me know.
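To make the three points above easier to compare, here is a small
sketch in C. It is only a summary table of this mail, not code from
the patch: the type fail_outcome_t, the table and lookup_outcome() are
hypothetical names, and the "restart" row is my reading of point 1
(only the failed resource and its group are affected).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of what each on_fail value means for the node
 * on which a start or monitor operation failed (points 1-3 above). */
typedef struct {
    const char *on_fail;      /* value of the on_fail attribute */
    int all_resources_leave;  /* 1: every resource moves off the node */
    int clean_stop;           /* 1: resources are stopped normally */
    int node_survives;        /* 1: node is not rebooted, evidence kept */
} fail_outcome_t;

static const fail_outcome_t outcomes[] = {
    { "restart", 0, 1, 1 },   /* only the failed resource and its group */
    { "fence",   1, 0, 0 },   /* node is killed by STONITH */
    { "standby", 1, 1, 1 },   /* what this patch aims for */
};

/* Return the row for an on_fail value, or NULL if this sketch
 * does not cover it. */
static const fail_outcome_t *lookup_outcome(const char *on_fail)
{
    assert(on_fail != NULL);
    for (size_t i = 0; i < sizeof(outcomes) / sizeof(outcomes[0]); i++) {
        if (strcmp(outcomes[i].on_fail, on_fail) == 0) {
            return &outcomes[i];
        }
    }
    return NULL;
}
```

Calling lookup_outcome("standby") returns the only row in which all
three columns are 1, which is the combination that points 1 to 3 ask
for.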
Just for reference, there is an example in the attached files:
a resource group named "grpPostgreSQLDB", consisting of
IPaddr ("prmIpPostgreSQLDB") and pgsql ("prmApPostgreSQLDB"), is
running on node2.
(See: crm_mon_before.log)
I modified pgsql's stop function to always return $OCF_ERR_GENERIC.
When the IPaddr resource failed with its monitor's on_fail set to
"standby", pgsql tried to stop but failed.
(See: pe-warn-0.node2.gif)
Then STONITH was executed according to the setting of pgsql's stop
operation, on_fail="fence".
(See: pe-warn-1.node2.gif and pe-warn-0.node1.gif)
STONITH killed node2 mercilessly, and both resources of the group
moved to node1 peacefully.
(See: crm_mon_after.log)
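The escalation in this example can be sketched as a tiny decision
function. This is a simplified model of the behaviour described above,
not pengine code: on_fail_t, node_result_t and recover_node() are
hypothetical names, and the model assumes that the stop operation's
on_fail is always "fence".

```c
#include <assert.h>

typedef enum { POLICY_STANDBY, POLICY_FENCE } on_fail_t;
typedef enum { NODE_STANDBY, NODE_FENCED } node_result_t;

/* Hypothetical model: a monitor (or start) operation has failed on a node.
 * monitor_on_fail is the on_fail value of the failed operation;
 * stop_succeeds says whether every resource on the node stops cleanly.
 * Assumption: on_fail of all stop operations is "fence". */
static node_result_t recover_node(on_fail_t monitor_on_fail, int stop_succeeds)
{
    assert(monitor_on_fail == POLICY_STANDBY || monitor_on_fail == POLICY_FENCE);

    if (monitor_on_fail == POLICY_FENCE) {
        /* The node is killed without trying to stop anything. */
        return NODE_FENCED;
    }
    /* on_fail="standby": all resources on the node are stopped first. */
    if (!stop_succeeds) {
        /* A failed stop escalates through the stop operation's on_fail. */
        return NODE_FENCED;
    }
    return NODE_STANDBY;
}
```

In the test above pgsql's stop always fails, which corresponds to
recover_node(POLICY_STANDBY, 0) and ends with node2 being fenced.
With a working stop function the same failure would leave node2 up
and in standby.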
Best Regards,
Satomi Taniguchi
Andrew Beekhof wrote:
On Aug 4, 2008, at 8:11 AM, Satomi Taniguchi wrote:
Hi Andrew,
Thank you for your opinions!
But I'm afraid that you've misunderstood my intentions...
no, i'm indicating that you've underestimated the scope of the
problem
Andrew Beekhof wrote:
(snip)
Two problems...
The first is that standby happens after the fencing event, so
it's not really doing anything to migrate the healthy resources.
In the graph, the object "stonith-1 stop 0 rh5node1" just means
"a plugin named stonith-1 on rh5node1 stops",
not "fencing event occurs".
For example, Node1 has two resource groups.
When a resource in one group fails,
all resources in both groups are stopped completely,
and the stonith plugin on Node1 is stopped.
After this, both resource groups run on Node2.
I attached a graph, cib.xml
and crm_mon's logs (before and after a resource broke down).
Please see them.
Stop RscZ -(depends on)-> Stop RscY -(depends on)-> Stonith
NodeX -(depends on)-> Stop RscZ -(depends on)-> ...
I just want to stop all resources without STONITH when monitor fails;
I don't want to change any behaviour when stop fails.
The setting on_fail="standby" is for start or monitor operations,
and
it assumes that the stop operation's on_fail
is "fence".
Then STONITH is not executed when start or monitor fails,
but it is executed when stop fails.
So, if RscY's monitor operation fails,
its stop operation doesn't depend on "Stonith NodeX".
And if stopping RscY fails,
NodeX is turned off by STONITH, and the loop above does not occur.
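The argument about the loop can be checked on a toy model. The sketch
below is not the pengine's real transition graph: toy_graph_t,
build_toy_graph() and has_dependency_loop() are hypothetical names,
and the edges are only my reading of the chain quoted above and of
the ordering the patch adds (the stops come before the standby action).

```c
#include <assert.h>
#include <string.h>

enum { STOP_RSC_Z, STOP_RSC_Y, NODE_ACTION, N_ACTIONS };

/* dep[a][b] != 0 means "action a depends on action b". */
typedef struct {
    int dep[N_ACTIONS][N_ACTIONS];
} toy_graph_t;

/* Build the toy graph for a failed monitor of RscY on NodeX. */
static void build_toy_graph(toy_graph_t *g, int monitor_on_fail_is_fence)
{
    assert(g != NULL);
    memset(g, 0, sizeof(*g));
    g->dep[STOP_RSC_Z][STOP_RSC_Y] = 1;
    if (monitor_on_fail_is_fence) {
        /* The quoted chain: NODE_ACTION stands for "Stonith NodeX". */
        g->dep[STOP_RSC_Y][NODE_ACTION] = 1;
        g->dep[NODE_ACTION][STOP_RSC_Z] = 1;
    } else {
        /* NODE_ACTION is the standby action: it waits for the stops,
         * and no stop depends on it. */
        g->dep[NODE_ACTION][STOP_RSC_Y] = 1;
        g->dep[NODE_ACTION][STOP_RSC_Z] = 1;
    }
}

/* Depth-first search; state: 0 = unvisited, 1 = on current path, 2 = done. */
static int visit(const toy_graph_t *g, int a, int *state)
{
    if (state[a] == 1) return 1;   /* reached an action on our own path */
    if (state[a] == 2) return 0;
    state[a] = 1;
    for (int b = 0; b < N_ACTIONS; b++) {
        if (g->dep[a][b] && visit(g, b, state)) return 1;
    }
    state[a] = 2;
    return 0;
}

/* Return 1 if the dependencies contain a loop, 0 otherwise. */
static int has_dependency_loop(const toy_graph_t *g)
{
    int state[N_ACTIONS] = {0};
    assert(g != NULL);
    for (int a = 0; a < N_ACTIONS; a++) {
        if (visit(g, a, state)) return 1;
    }
    return 0;
}
```

Building the graph with monitor_on_fail_is_fence set to 1 reproduces
the quoted loop; with 0 (the standby case) the stop of RscY has no
dependency on the node action, so the search finds no loop.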
Best Regards,
Satomi Taniguchi
_______________________________________________
Pacemaker mailing list
Pacemaker@clusterlabs.org
http://list.clusterlabs.org/mailman/listinfo/pacemaker
------------------------------------------------------------------------
diff -urN pacemaker-dev.orig/crmd/te_actions.c pacemaker-dev/crmd/te_actions.c
--- pacemaker-dev.orig/crmd/te_actions.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/crmd/te_actions.c 2008-09-24 12:26:54.000000000 +0900
@@ -161,6 +161,54 @@
return TRUE;
}
+static gboolean
+te_standby_node(crm_graph_t *graph, crm_action_t *action)
+{
+ const char *id = NULL;
+ const char *uuid = NULL;
+ const char *target = NULL;
+
+ char *attr_id = NULL;
+ int str_length = 2;
+ const char *attr_name = "standby";
+
+ id = ID(action->xml);
+ target = crm_element_value(action->xml, XML_LRM_ATTR_TARGET);
+ uuid = crm_element_value(action->xml, XML_LRM_ATTR_TARGET_UUID);
+
+ CRM_CHECK(id != NULL,
+ crm_log_xml_warn(action->xml, "BadAction");
+ return FALSE);
+ CRM_CHECK(uuid != NULL,
+ crm_log_xml_warn(action->xml, "BadAction");
+ return FALSE);
+ CRM_CHECK(target != NULL,
+ crm_log_xml_warn(action->xml, "BadAction");
+ return FALSE);
+
+ te_log_action(LOG_INFO,
+ "Executing standby operation (%s) on %s", id, target);
+
+ str_length += strlen(attr_name);
+ str_length += strlen(uuid);
+
+ crm_malloc0(attr_id, str_length);
+ sprintf(attr_id, "%s-%s", attr_name, uuid);
+
+ if (cib_ok > update_attr(fsa_cib_conn, cib_inhibit_notify,
+ XML_CIB_TAG_NODES, uuid, NULL, attr_id, attr_name, "on", FALSE)) {
+ crm_err("Cannot standby %s: update_attr() call failed.", target);
+ }
+ crm_free(attr_id);
+
+ crm_info("Skipping wait for %d", action->id);
+ action->confirmed = TRUE;
+ update_graph(graph, action);
+ trigger_graph();
+
+ return TRUE;
+}
+
static int get_target_rc(crm_action_t *action)
{
const char *target_rc_s = g_hash_table_lookup(
@@ -471,7 +519,8 @@
te_pseudo_action,
te_rsc_command,
te_crm_command,
- te_fence_node
+ te_fence_node,
+ te_standby_node
};
void
diff -urN pacemaker-dev.orig/include/crm/crm.h pacemaker-dev/include/crm/crm.h
--- pacemaker-dev.orig/include/crm/crm.h 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/include/crm/crm.h 2008-09-24 12:26:54.000000000 +0900
@@ -143,6 +143,7 @@
#define CRM_OP_SHUTDOWN_REQ "req_shutdown"
#define CRM_OP_SHUTDOWN "do_shutdown"
#define CRM_OP_FENCE "stonith"
+#define CRM_OP_STANDBY "standby"
#define CRM_OP_EVENTCC "event_cc"
#define CRM_OP_TEABORT "te_abort"
#define CRM_OP_TEABORTED "te_abort_confirmed" /* we asked */
diff -urN pacemaker-dev.orig/include/crm/pengine/common.h pacemaker-dev/include/crm/pengine/common.h
--- pacemaker-dev.orig/include/crm/pengine/common.h 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/include/crm/pengine/common.h 2008-09-24 12:26:54.000000000 +0900
@@ -33,6 +33,7 @@
action_fail_migrate, /* recover by moving it somewhere else */
action_fail_block,
action_fail_stop,
+ action_fail_standby,
action_fail_fence
};
@@ -51,6 +52,7 @@
action_demote,
action_demoted,
shutdown_crm,
+ standby_node,
stonith_node
};
diff -urN pacemaker-dev.orig/include/crm/pengine/status.h pacemaker-dev/include/crm/pengine/status.h
--- pacemaker-dev.orig/include/crm/pengine/status.h 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/include/crm/pengine/status.h 2008-09-24 12:26:54.000000000 +0900
@@ -107,6 +107,7 @@
gboolean standby;
gboolean pending;
gboolean unclean;
+ gboolean action_standby;
gboolean shutdown;
gboolean expected_up;
gboolean is_dc;
diff -urN pacemaker-dev.orig/include/crm/transition.h pacemaker-dev/include/crm/transition.h
--- pacemaker-dev.orig/include/crm/transition.h 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/include/crm/transition.h 2008-09-24 12:26:54.000000000 +0900
@@ -113,6 +113,7 @@
gboolean (*rsc)(crm_graph_t *graph, crm_action_t *action);
gboolean (*crmd)(crm_graph_t *graph, crm_action_t *action);
gboolean (*stonith)(crm_graph_t *graph, crm_action_t *action);
+ gboolean (*standby)(crm_graph_t *graph, crm_action_t *action);
} crm_graph_functions_t;
enum transition_status {
diff -urN pacemaker-dev.orig/lib/pengine/common.c pacemaker-dev/lib/pengine/common.c
--- pacemaker-dev.orig/lib/pengine/common.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/pengine/common.c 2008-09-24 12:26:54.000000000 +0900
@@ -154,6 +154,9 @@
case action_fail_fence:
result = "fence";
break;
+ case action_fail_standby:
+ result = "standby";
+ break;
}
return result;
}
@@ -175,6 +178,8 @@
return shutdown_crm;
} else if(safe_str_eq(task, CRM_OP_FENCE)) {
return stonith_node;
+ } else if(safe_str_eq(task, CRM_OP_STANDBY)) {
+ return standby_node;
} else if(safe_str_eq(task, CRMD_ACTION_STATUS)) {
return monitor_rsc;
} else if(safe_str_eq(task, CRMD_ACTION_NOTIFY)) {
@@ -242,6 +247,9 @@
case stonith_node:
result = CRM_OP_FENCE;
break;
+ case standby_node:
+ result = CRM_OP_STANDBY;
+ break;
case monitor_rsc:
result = CRMD_ACTION_STATUS;
break;
diff -urN pacemaker-dev.orig/lib/pengine/unpack.c pacemaker-dev/lib/pengine/unpack.c
--- pacemaker-dev.orig/lib/pengine/unpack.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/pengine/unpack.c 2008-09-24 12:26:54.000000000 +0900
@@ -244,6 +244,7 @@
*/
new_node->details->unclean = TRUE;
}
+ new_node->details->action_standby = FALSE;
if(type == NULL
|| safe_str_eq(type, "member")
@@ -811,6 +812,10 @@
node->details->unclean = TRUE;
stop_action(rsc, node, FALSE);
+ } else if(on_fail == action_fail_standby) {
+ node->details->action_standby = TRUE;
+ stop_action(rsc, node, FALSE);
+
} else if(on_fail == action_fail_block) {
/* is_managed == FALSE will prevent any
* actions being sent for the resource
diff -urN pacemaker-dev.orig/lib/pengine/utils.c pacemaker-dev/lib/pengine/utils.c
--- pacemaker-dev.orig/lib/pengine/utils.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/pengine/utils.c 2008-09-24 12:26:54.000000000 +0900
@@ -707,6 +707,10 @@
value = "stop resource";
}
+ } else if(safe_str_eq(value, "standby")) {
+ action->on_fail = action_fail_standby;
+ value = "node standby";
+
} else if(safe_str_eq(value, "ignore")
|| safe_str_eq(value, "nothing")) {
action->on_fail = action_fail_ignore;
diff -urN pacemaker-dev.orig/lib/transition/graph.c pacemaker-dev/lib/transition/graph.c
--- pacemaker-dev.orig/lib/transition/graph.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/transition/graph.c 2008-09-24 12:26:54.000000000 +0900
@@ -188,6 +188,11 @@
crm_debug_2("Executing STONITH-event: %d",
action->id);
return graph_fns->stonith(graph, action);
+
+ } else if(safe_str_eq(task, CRM_OP_STANDBY)) {
+ crm_debug_2("Executing STANDBY-event: %d",
+ action->id);
+ return graph_fns->standby(graph, action);
}
crm_debug_2("Executing crm-event: %d", action->id);
diff -urN pacemaker-dev.orig/lib/transition/utils.c pacemaker-dev/lib/transition/utils.c
--- pacemaker-dev.orig/lib/transition/utils.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/lib/transition/utils.c 2008-09-24 12:26:54.000000000 +0900
@@ -41,6 +41,7 @@
pseudo_action_dummy,
pseudo_action_dummy,
pseudo_action_dummy,
+ pseudo_action_dummy,
pseudo_action_dummy
};
@@ -61,6 +62,7 @@
CRM_ASSERT(graph_fns->crmd != NULL);
CRM_ASSERT(graph_fns->pseudo != NULL);
CRM_ASSERT(graph_fns->stonith != NULL);
+ CRM_ASSERT(graph_fns->standby != NULL);
}
const char *
diff -urN pacemaker-dev.orig/pengine/allocate.c pacemaker-dev/pengine/allocate.c
--- pacemaker-dev.orig/pengine/allocate.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/pengine/allocate.c 2008-09-24 12:26:54.000000000 +0900
@@ -777,6 +777,14 @@
last_stonith = stonith_op;
}
+ } else if(node->details->online && node->details->action_standby) {
+ action_t *standby_op = NULL;
+
+ standby_op = custom_action(
+ NULL, crm_strdup(CRM_OP_STANDBY),
+ CRM_OP_STANDBY, node, FALSE, TRUE, data_set);
+ standby_constraints(node, standby_op, data_set);
+
} else if(node->details->online && node->details->shutdown) {
action_t *down_op = NULL;
crm_info("Scheduling Node %s for shutdown",
diff -urN pacemaker-dev.orig/pengine/graph.c pacemaker-dev/pengine/graph.c
--- pacemaker-dev.orig/pengine/graph.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/pengine/graph.c 2008-09-24 12:26:54.000000000 +0900
@@ -347,6 +347,29 @@
return TRUE;
}
+gboolean
+standby_constraints(
+ node_t *node, action_t *standby_op, pe_working_set_t *data_set)
+{
+ /* add the stop to the before lists so it counts as a pre-req
+ * for the standby
+ */
+ slist_iter(
+ rsc, resource_t, node->details->running_rsc, lpc,
+
+ if(is_not_set(rsc->flags, pe_rsc_managed)) {
+ continue;
+ }
+
+ custom_action_order(
+ rsc, stop_key(rsc), NULL,
+ NULL, crm_strdup(CRM_OP_STANDBY), standby_op,
+ pe_order_implies_left, data_set);
+ );
+
+ return TRUE;
+}
+
static void dup_attr(gpointer key, gpointer value, gpointer user_data)
{
g_hash_table_replace(user_data, crm_strdup(key), crm_strdup(value));
@@ -369,6 +392,9 @@
action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
/* needs_node_info = FALSE; */
+ } else if(safe_str_eq(action->task, CRM_OP_STANDBY)) {
+ action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
+
} else if(safe_str_eq(action->task, CRM_OP_SHUTDOWN)) {
action_xml = create_xml_node(NULL, XML_GRAPH_TAG_CRM_EVENT);
diff -urN pacemaker-dev.orig/pengine/group.c pacemaker-dev/pengine/group.c
--- pacemaker-dev.orig/pengine/group.c 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/pengine/group.c 2008-09-24 12:26:54.000000000 +0900
@@ -435,6 +435,7 @@
case action_notified:
case shutdown_crm:
case stonith_node:
+ case standby_node:
break;
case stop_rsc:
case stopped_rsc:
diff -urN pacemaker-dev.orig/pengine/pengine.h pacemaker-dev/pengine/pengine.h
--- pacemaker-dev.orig/pengine/pengine.h 2008-09-24 11:05:09.000000000 +0900
+++ pacemaker-dev/pengine/pengine.h 2008-09-24 12:26:54.000000000 +0900
@@ -150,6 +150,9 @@
extern gboolean stonith_constraints(
node_t *node, action_t *stonith_op, pe_working_set_t *data_set);
+extern gboolean standby_constraints(
+ node_t *node, action_t *standby_op, pe_working_set_t *data_set);
+
extern int custom_action_order(
resource_t *lh_rsc, char *lh_task, action_t *lh_action,
resource_t *rh_rsc, char *rh_task, action_t *rh_action,
diff -urN pacemaker-dev.orig/pengine/utils.c pacemaker-dev/pengine/utils.c
--- pacemaker-dev.orig/pengine/utils.c 2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/pengine/utils.c 2008-09-24 12:26:54.000000000 +0900
@@ -180,10 +180,13 @@
if(node->details->online == FALSE
|| node->details->shutdown
|| node->details->unclean
- || node->details->standby) {
- crm_debug_2("%s: online=%d, unclean=%d, standby=%d",
+ || node->details->standby
+ || node->details->action_standby) {
+ crm_debug_2("%s: online=%d, unclean=%d, standby=%d" \
+ ", action_standby=%d",
node->details->uname, node->details->online,
- node->details->unclean, node->details->standby);
+ node->details->unclean, node->details->standby,
+ node->details->action_standby);
return FALSE;
}
return TRUE;
@@ -337,6 +340,7 @@
case monitor_rsc:
case shutdown_crm:
case stonith_node:
+ case standby_node:
task = no_action;
break;
default:
@@ -429,6 +433,7 @@
switch(text2task(action->task)) {
case stonith_node:
+ case standby_node:
case shutdown_crm:
do_crm_log(log_level,
"%s%s%sAction %d: %s%s%s%s%s%s",
diff -urN pacemaker-dev.orig/xml/crm-1.0.dtd pacemaker-dev/xml/crm-1.0.dtd
--- pacemaker-dev.orig/xml/crm-1.0.dtd 2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/xml/crm-1.0.dtd 2008-09-24 12:26:54.000000000 +0900
@@ -266,7 +266,7 @@
disabled (true|yes|1|false|no|0) 'false'
role (Master|Slave|Started|Stopped) 'Started'
prereq (nothing|quorum|fencing) #IMPLIED
- on_fail (ignore|block|stop|restart|fence) #IMPLIED>
+ on_fail (ignore|block|stop|restart|fence|standby) #IMPLIED>
<!--
Use this to emulate v1 type Heartbeat groups.
Defining a resource group is a quick way to make sure that the
resources:
diff -urN pacemaker-dev.orig/xml/crm-transitional.dtd pacemaker-dev/xml/crm-transitional.dtd
--- pacemaker-dev.orig/xml/crm-transitional.dtd 2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/xml/crm-transitional.dtd 2008-09-24 12:26:54.000000000 +0900
@@ -272,7 +272,7 @@
disabled (true|yes|1|false|no|0) 'false'
role (Master|Slave|Started|Stopped) 'Started'
prereq (nothing|quorum|fencing) #IMPLIED
- on_fail (ignore|block|stop|restart|fence) #IMPLIED>
+ on_fail (ignore|block|stop|restart|fence|standby) #IMPLIED>
<!--
Use this to emulate v1 type Heartbeat groups.
Defining a resource group is a quick way to make sure that the
resources:
diff -urN pacemaker-dev.orig/xml/crm.dtd pacemaker-dev/xml/crm.dtd
--- pacemaker-dev.orig/xml/crm.dtd 2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/xml/crm.dtd 2008-09-24 12:26:54.000000000 +0900
@@ -266,7 +266,7 @@
disabled (true|yes|1|false|no|0) 'false'
role (Master|Slave|Started|Stopped) 'Started'
prereq (nothing|quorum|fencing) #IMPLIED
- on_fail (ignore|block|stop|restart|fence) #IMPLIED>
+ on_fail (ignore|block|stop|restart|fence|standby) #IMPLIED>
<!--
Use this to emulate v1 type Heartbeat groups.
Defining a resource group is a quick way to make sure that the
resources:
diff -urN pacemaker-dev.orig/xml/resources.rng.in pacemaker-dev/xml/resources.rng.in
--- pacemaker-dev.orig/xml/resources.rng.in 2008-09-24 11:05:12.000000000 +0900
+++ pacemaker-dev/xml/resources.rng.in 2008-09-24 12:26:54.000000000 +0900
@@ -160,6 +160,7 @@
<value>block</value>
<value>stop</value>
<value>restart</value>
+ <value>standby</value>
<value>fence</value>
</choice>
</attribute>