The branch, master has been updated
       via  0858b11 ctdb-tests: Use ctdb_node_list_to_map() in tool stubs
       via  1ef1cfd ctdb-common: Move ctdb_node_list_to_map() to utilities
       via  dd52d82 ctdb-daemon: Factor out new function ctdb_node_list_to_map()
       via  ffbe0a6 ctdb-tools: Drop the recovery from "reloadnodes"
       via  d340f30 ctdb-daemon: Don't delay reloading the nodes file
       via  85bd9a3 ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled
       via  13dc4a9 ctdb-tool: Update "reloadnodes" to disable recoveries
       via  ee9619c ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES
       via  2ca484c ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable()
       via  108db33 ctdb-recoverd: Add slightly more abstraction for disabling takeover runs
       via  ec32d9b ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable()
       via  281f7e8 ctdb-recoverd: Use a goto for do_recovery() failures
       via  a2044c6 ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable()
       via  55b2461 ctdb-recoverd: Add a new abstraction ctdb_op_disable()
       via  ae9cd037 ctdb-daemon: Pass on consistent flag information to recovery daemon
       via  4b972bb ctdb-tests: Add "ctdb reloadnodes" test for "node remains deleted"
       via  181658f ctdb-tools: Fix spurious messages about deleted nodes being disconnected
      from  b57c778 rpc_server: Coverity fix for CID 1273079

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 0858b11ff735b535bfeded346c87a0c245d902c7
Author: Martin Schwenke <mar...@meltin.net>
Date:   Sun Feb 22 06:37:41 2015 +1100

    ctdb-tests: Use ctdb_node_list_to_map() in tool stubs

    Drop copy of old ctdb_control_nodemap().

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

    Autobuild-User(master): Amitay Isaacs <ami...@samba.org>
    Autobuild-Date(master): Tue Apr  7 10:20:41 CEST 2015 on sn-devel-104

commit 1ef1cfdc4d6b923357630451177fdcde1d616e87
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 20 12:34:25 2015 +1100

    ctdb-common: Move ctdb_node_list_to_map() to utilities

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit dd52d82c73b26a3fed6dfd4aaf7d51f576d019d9
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 20 12:31:37 2015 +1100

    ctdb-daemon: Factor out new function ctdb_node_list_to_map()

    Change ctdb_control_getnodemap() to use this.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit ffbe0a6def236f5d0b03d089a7fc3f060eb0e392
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Feb 4 12:06:56 2015 +1100

    ctdb-tools: Drop the recovery from "reloadnodes"

    A recovery is not required: when deleting a node it should already be
    disconnected and when adding a node it will also be disconnected.
    The new sanity checks in "reloadnodes" ensure that these assumptions
    are met.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit d340f308e76af53b04ae9b5432c4f6c84315303a
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Feb 10 15:43:03 2015 +1100

    ctdb-daemon: Don't delay reloading the nodes file

    Presumably this was done to minimise the chance of a recovery
    occurring while the nodemaps are inconsistent across nodes.

    Another potential theory is that the forced recovery in the
    ctdb.c:control_reload_nodes_file() stops another recovery occurring
    for ReRecoveryTimeout seconds, so this delay causes the reloads to
    occur during that period.

    This is no longer necessary because recoveries are now explicitly
    disabled while node files are reloaded.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 85bd9a33eb65d6fd03ad85aeedf141a2813c2bb8
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 6 20:59:11 2015 +1100

    ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled

    The potential resulting recovery won't run anyway.  Also recoveries
    may have been disabled by "reloadnodes" and if the nodemaps are
    inconsistent between nodes then avoid triggering an unnecessary
    recovery.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 13dc4a98426b30e7226015b1d8a86ec2e80d6228
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Feb 9 20:20:44 2015 +1100

    ctdb-tool: Update "reloadnodes" to disable recoveries

    If a recovery occurs when some nodes have reloaded and others haven't
    then the nodemaps will be inconsistent so bad things will happen.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit ee9619c28b594b7fec8093b522ac205e5d4eb0ea
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 6 15:06:44 2015 +1100

    ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES

    Also add test stub support.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 2ca484cd50c2655c59802cae6c81982b42bf61eb
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 6 15:03:03 2015 +1100

    ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable()

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 108db3396f71a35ef1690a5b483d2728223803df
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 6 13:05:12 2015 +1100

    ctdb-recoverd: Add slightly more abstraction for disabling takeover runs

    Factor out new function srvid_disable_and_reply(), which can be
    re-used.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit ec32d9bea8993778cd6b0fc63bfde492ee21d830
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 6 14:47:33 2015 +1100

    ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable()

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 281f7e8152e01a15e9df946ee293156ded8b2857
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 6 14:32:08 2015 +1100

    ctdb-recoverd: Use a goto for do_recovery() failures

    This will allow extra things to be done on failure.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit a2044c65bc669e7240bd4ffc4b6935f57f493535
Author: Martin Schwenke <mar...@meltin.net>
Date:   Sun Feb 8 20:52:12 2015 +1100

    ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable()

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 55b246195b282175022ea2ae239ebcd5d4970d3f
Author: Martin Schwenke <mar...@meltin.net>
Date:   Sun Feb 8 20:50:38 2015 +1100

    ctdb-recoverd: Add a new abstraction ctdb_op_disable()

    This can be used to disable and re-enable an operation, and do all
    the relevant sanity checking.

    Most of this is from existing functions
    disable_takeover_runs_handler(), clear_takeover_runs_disable() and
    reenable_takeover_runs().

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit ae9cd037ee96c000b11aaa7d171463b00fe4850c
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Feb 4 17:18:12 2015 +1100

    ctdb-daemon: Pass on consistent flag information to recovery daemon

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Pair-programmed-with: Amitay Isaacs <ami...@gmail.com>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 4b972bbdb3e2d3f35fad3c47dc6e84f0fee513c4
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Apr 1 18:00:04 2015 +1100

    ctdb-tests: Add "ctdb reloadnodes" test for "node remains deleted"

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 181658f5bb180c48f88504a703ed3a3758ac3b5b
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Apr 1 17:10:46 2015 +1100

    ctdb-tools: Fix spurious messages about deleted nodes being disconnected

    The code was too "clever".  The 4 different cases should be separate.
    The "node remains deleted" case doesn't need the IP address
    comparison (always 0.0.0.0) or the disconnected check.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/common/ctdb_util.c                            |   27 ++
 ctdb/include/ctdb_private.h                        |    3 +
 ctdb/include/ctdb_protocol.h                       |    3 +
 ctdb/server/ctdb_monitor.c                         |    1 +
 ctdb/server/ctdb_recover.c                         |   47 +---
 ctdb/server/ctdb_recoverd.c                        |  293 +++++++++++++--------
 ctdb/tests/src/ctdb_test_stubs.c                   |   50 +---
 ...eloadnodes.001.sh => stubby.reloadnodes.024.sh} |    9 +-
 ctdb/tools/ctdb.c                                  |   31 ++-
 9 files changed, 263 insertions(+), 201 deletions(-)
 copy ctdb/tests/tool/{stubby.reloadnodes.001.sh => stubby.reloadnodes.024.sh} (72%)


Changeset truncated at 500 lines:

diff --git a/ctdb/common/ctdb_util.c b/ctdb/common/ctdb_util.c
index 76fb06d..8e2e430 100644
--- a/ctdb/common/ctdb_util.c
+++ b/ctdb/common/ctdb_util.c
@@ -579,6 +579,33 @@ struct ctdb_node_map *ctdb_read_nodes_file(TALLOC_CTX *mem_ctx,
 	return ret;
 }
 
+struct ctdb_node_map *
+ctdb_node_list_to_map(struct ctdb_node **nodes, uint32_t num_nodes,
+		      TALLOC_CTX *mem_ctx)
+{
+	uint32_t i;
+	size_t size;
+	struct ctdb_node_map *node_map;
+
+	size = offsetof(struct ctdb_node_map, nodes) +
+		num_nodes * sizeof(struct ctdb_node_and_flags);
+	node_map = (struct ctdb_node_map *)talloc_zero_size(mem_ctx, size);
+	if (node_map == NULL) {
+		DEBUG(DEBUG_ERR,
+		      (__location__ " Failed to allocate nodemap array\n"));
+		return NULL;
+	}
+
+	node_map->num = num_nodes;
+	for (i=0; i<num_nodes; i++) {
+		node_map->nodes[i].addr  = nodes[i]->address;
+		node_map->nodes[i].pnn   = nodes[i]->pnn;
+		node_map->nodes[i].flags = nodes[i]->flags;
+	}
+
+	return node_map;
+}
+
 const char *ctdb_eventscript_call_names[] = {
 	"init",
 	"setup",
diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index b37d5bb..532f859 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -1388,6 +1388,9 @@ int ctdb_client_async_control(struct ctdb_context *ctdb,
 				client_async_callback fail_callback,
 				void *callback_data);
 
+struct ctdb_node_map *
+ctdb_node_list_to_map(struct ctdb_node **nodes, uint32_t num_nodes,
+		      TALLOC_CTX *mem_ctx);
 struct ctdb_node_map *ctdb_read_nodes_file(TALLOC_CTX *mem_ctx,
 					   const char *nlist);
 void ctdb_load_nodes_file(struct ctdb_context *ctdb);
diff --git a/ctdb/include/ctdb_protocol.h b/ctdb/include/ctdb_protocol.h
index c828c01..4dea56b 100644
--- a/ctdb/include/ctdb_protocol.h
+++ b/ctdb/include/ctdb_protocol.h
@@ -156,6 +156,9 @@ struct ctdb_call_info {
 /* A message handler ID to stop takeover runs from occurring */
 #define CTDB_SRVID_DISABLE_TAKEOVER_RUNS 0xFB03000000000000LL
 
+/* A message handler ID to stop recoveries from occurring */
+#define CTDB_SRVID_DISABLE_RECOVERIES 0xFB04000000000000LL
+
 /* A message id to ask the recovery daemon to temporarily disable the
    public ip checks
 */
diff --git a/ctdb/server/ctdb_monitor.c b/ctdb/server/ctdb_monitor.c
index 9b8df6d..5c0c055 100644
--- a/ctdb/server/ctdb_monitor.c
+++ b/ctdb/server/ctdb_monitor.c
@@ -497,6 +497,7 @@ int32_t ctdb_control_modflags(struct ctdb_context *ctdb, TDB_DATA indata)
 	}
 
 	/* tell the recovery daemon something has changed */
+	c->new_flags = node->flags;
 	ctdb_daemon_send_message(ctdb, ctdb->pnn,
 				 CTDB_SRVID_SET_NODE_FLAGS, indata);
 
diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c
index eb3f46d..7a684d5 100644
--- a/ctdb/server/ctdb_recover.c
+++ b/ctdb/server/ctdb_recover.c
@@ -118,30 +118,19 @@ ctdb_control_getdbmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indat
 	return 0;
 }
 
-int 
+int
 ctdb_control_getnodemap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata)
 {
-	uint32_t i, num_nodes;
-	struct ctdb_node_map *node_map;
-
 	CHECK_CONTROL_DATA_SIZE(0);
 
-	num_nodes = ctdb->num_nodes;
-
-	outdata->dsize = offsetof(struct ctdb_node_map, nodes) + num_nodes*sizeof(struct ctdb_node_and_flags);
-	outdata->dptr  = (unsigned char *)talloc_zero_size(outdata, outdata->dsize);
-	if (!outdata->dptr) {
-		DEBUG(DEBUG_ALERT, (__location__ " Failed to allocate nodemap array\n"));
-		exit(1);
+	outdata->dptr  = (unsigned char *)ctdb_node_list_to_map(ctdb->nodes,
+							       ctdb->num_nodes,
+							       outdata);
+	if (outdata->dptr == NULL) {
+		return -1;
 	}
 
-	node_map = (struct ctdb_node_map *)outdata->dptr;
-	node_map->num = num_nodes;
-	for (i=0; i<num_nodes; i++) {
-		node_map->nodes[i].addr = ctdb->nodes[i]->address;
-		node_map->nodes[i].pnn = ctdb->nodes[i]->pnn;
-		node_map->nodes[i].flags = ctdb->nodes[i]->flags;
-	}
+	outdata->dsize = talloc_get_size(outdata->dptr);
 
 	return 0;
 }
@@ -177,14 +166,15 @@ ctdb_control_getnodemapv4(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA i
 	return 0;
 }
 
-static void
-ctdb_reload_nodes_event(struct event_context *ev, struct timed_event *te,
-			       struct timeval t, void *private_data)
+/*
+  reload the nodes file
+*/
+int
+ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)
 {
 	int i, num_nodes;
-	struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);
 	TALLOC_CTX *tmp_ctx;
-	struct ctdb_node **nodes;
+	struct ctdb_node **nodes;
 
 	tmp_ctx = talloc_new(ctdb);
 
@@ -225,17 +215,6 @@ ctdb_reload_nodes_event(struct event_context *ev, struct timed_event *te,
 	ctdb_daemon_send_message(ctdb, ctdb->pnn, CTDB_SRVID_RELOAD_NODES, tdb_null);
 
 	talloc_free(tmp_ctx);
-	return;
-}
-
-/*
-  reload the nodes file after a short delay (so that we can send the response
-  back first
-*/
-int
-ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)
-{
-	event_add_timed(ctdb->ev, ctdb, timeval_current_ofs(1,0), ctdb_reload_nodes_event, ctdb);
 	return 0;
 }
 
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 99018be..673075a 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -117,6 +117,103 @@ nomem:
 	srvid_request_reply(ctdb, request, result);
 }
 
+/* An abstraction to allow an operation (takeover runs, recoveries,
+ * ...) to be disabled for a given timeout */
+struct ctdb_op_state {
+	struct tevent_timer *timer;
+	bool in_progress;
+	const char *name;
+};
+
+static struct ctdb_op_state *ctdb_op_init(TALLOC_CTX *mem_ctx, const char *name)
+{
+	struct ctdb_op_state *state = talloc_zero(mem_ctx, struct ctdb_op_state);
+
+	if (state != NULL) {
+		state->in_progress = false;
+		state->name = name;
+	}
+
+	return state;
+}
+
+static bool ctdb_op_is_disabled(struct ctdb_op_state *state)
+{
+	return state->timer != NULL;
+}
+
+static bool ctdb_op_begin(struct ctdb_op_state *state)
+{
+	if (ctdb_op_is_disabled(state)) {
+		DEBUG(DEBUG_NOTICE,
+		      ("Unable to begin - %s are disabled\n", state->name));
+		return false;
+	}
+
+	state->in_progress = true;
+	return true;
+}
+
+static bool ctdb_op_end(struct ctdb_op_state *state)
+{
+	return state->in_progress = false;
+}
+
+static bool ctdb_op_is_in_progress(struct ctdb_op_state *state)
+{
+	return state->in_progress;
+}
+
+static void ctdb_op_enable(struct ctdb_op_state *state)
+{
+	TALLOC_FREE(state->timer);
+}
+
+static void ctdb_op_timeout_handler(struct event_context *ev,
+				    struct timed_event *te,
+				    struct timeval yt, void *p)
+{
+	struct ctdb_op_state *state =
+		talloc_get_type(p, struct ctdb_op_state);
+
+	DEBUG(DEBUG_NOTICE,("Reenabling %s after timeout\n", state->name));
+	ctdb_op_enable(state);
+}
+
+static int ctdb_op_disable(struct ctdb_op_state *state,
+			   struct tevent_context *ev,
+			   uint32_t timeout)
+{
+	if (timeout == 0) {
+		DEBUG(DEBUG_NOTICE,("Reenabling %s\n", state->name));
+		ctdb_op_enable(state);
+		return 0;
+	}
+
+	if (state->in_progress) {
+		DEBUG(DEBUG_ERR,
+		      ("Unable to disable %s - in progress\n", state->name));
+		return -EAGAIN;
+	}
+
+	DEBUG(DEBUG_NOTICE,("Disabling %s for %u seconds\n",
+			    state->name, timeout));
+
+	/* Clear any old timers */
+	talloc_free(state->timer);
+
+	/* Arrange for the timeout to occur */
+	state->timer = tevent_add_timer(ev, state,
+					timeval_current_ofs(timeout, 0),
+					ctdb_op_timeout_handler, state);
+	if (state->timer == NULL) {
+		DEBUG(DEBUG_ERR,(__location__ " Unable to setup timer\n"));
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
 struct ctdb_banning_state {
 	uint32_t count;
 	struct timeval last_reported_time;
@@ -141,8 +238,8 @@ struct ctdb_recoverd {
 	struct timed_event *election_timeout;
 	struct vacuum_info *vacuum_info;
 	struct srvid_requests *reallocate_requests;
-	bool takeover_run_in_progress;
-	TALLOC_CTX *takeover_runs_disable_ctx;
+	struct ctdb_op_state *takeover_run;
+	struct ctdb_op_state *recovery;
 	struct ctdb_control_get_ifaces *ifaces;
 	uint32_t *force_rebalance_nodes;
 };
@@ -1566,7 +1663,7 @@ static int ctdb_reload_remote_public_ips(struct ctdb_context *ctdb,
 		}
 
 		if (ctdb->do_checkpublicip &&
-		    rec->takeover_runs_disable_ctx == NULL &&
+		    !ctdb_op_is_disabled(rec->takeover_run) &&
 		    verify_remote_ip_allocation(ctdb,
 						 node->known_public_ips,
 						 node->pnn)) {
@@ -1691,19 +1788,14 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
 
 	DEBUG(DEBUG_NOTICE, ("Takeover run starting\n"));
 
-	if (rec->takeover_run_in_progress) {
+	if (ctdb_op_is_in_progress(rec->takeover_run)) {
 		DEBUG(DEBUG_ERR, (__location__
 				  " takeover run already in progress \n"));
 		ok = false;
 		goto done;
 	}
 
-	rec->takeover_run_in_progress = true;
-
-	/* If takeover runs are in disabled then fail... */
-	if (rec->takeover_runs_disable_ctx != NULL) {
-		DEBUG(DEBUG_ERR,
-		      ("Takeover runs are disabled so refusing to run one\n"));
+	if (!ctdb_op_begin(rec->takeover_run)) {
 		ok = false;
 		goto done;
 	}
@@ -1767,7 +1859,7 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
 done:
 	rec->need_takeover_run = !ok;
 	talloc_free(nodes);
-	rec->takeover_run_in_progress = false;
+	ctdb_op_end(rec->takeover_run);
 
 	DEBUG(DEBUG_NOTICE, ("Takeover run %s\n", ok ? "completed successfully" : "unsuccessful"));
 	return ok;
@@ -1796,16 +1888,20 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	/* if recovery fails, force it again */
 	rec->need_recovery = true;
 
+	if (!ctdb_op_begin(rec->recovery)) {
+		return -1;
+	}
+
 	if (rec->election_timeout) {
 		/* an election is in progress */
 		DEBUG(DEBUG_ERR, ("do_recovery called while election in progress - try again later\n"));
-		return -1;
+		goto fail;
 	}
 
 	ban_misbehaving_nodes(rec, &self_ban);
 	if (self_ban) {
 		DEBUG(DEBUG_NOTICE, ("This node was banned, aborting recovery\n"));
-		return -1;
+		goto fail;
 	}
 
 	if (ctdb->recovery_lock_file != NULL) {
@@ -1823,14 +1919,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
 				 */
 				DEBUG(DEBUG_ERR, ("Unable to get recovery lock"
 						  " - retrying recovery\n"));
-				return -1;
+				goto fail;
 			}
 
 			DEBUG(DEBUG_ERR,("Unable to get recovery lock - aborting recovery "
 					 "and ban ourself for %u seconds\n",
 					 ctdb->tunable.recovery_ban_period));
 			ctdb_ban_node(rec, pnn, ctdb->tunable.recovery_ban_period);
-			return -1;
+			goto fail;
 		}
 		ctdb_ctrl_report_recd_lock_latency(ctdb,
 						   CONTROL_TIMEOUT(),
@@ -1846,7 +1942,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, &dbmap);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to get dbids from node :%u\n", pnn));
-		return -1;
+		goto fail;
 	}
 
 	/* we do the db creation before we set the recovery mode, so the freeze happens
@@ -1856,14 +1952,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = create_missing_local_databases(ctdb, nodemap, pnn, &dbmap, mem_ctx);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to create missing local databases\n"));
-		return -1;
+		goto fail;
 	}
 
 	/* verify that all other nodes have all our databases */
 	ret = create_missing_remote_databases(ctdb, nodemap, pnn, dbmap, mem_ctx);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to create missing remote databases\n"));
-		return -1;
+		goto fail;
 	}
 	DEBUG(DEBUG_NOTICE, (__location__ " Recovery - created remote databases\n"));
 
@@ -1884,14 +1980,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_ACTIVE);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to active on cluster\n"));
-		return -1;
+		goto fail;
 	}
 
 	/* execute the "startrecovery" event script on all nodes */
 	ret = run_startrecovery_eventscript(rec, nodemap);
 	if (ret!=0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'startrecovery' event on cluster\n"));
-		return -1;
+		goto fail;
 	}
 
 	/*
@@ -1908,7 +2004,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 				DEBUG(DEBUG_WARNING, (__location__ "Unable to update flags on inactive node %d\n", i));
 			} else {
 				DEBUG(DEBUG_ERR, (__location__ " Unable to update flags on all nodes for node %d\n", i));
-				return -1;
+				goto fail;
 			}
 		}
 	}
@@ -1932,7 +2028,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = ctdb_ctrl_setvnnmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, vnnmap);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to set vnnmap for node %u\n", pnn));
-		return -1;
+		goto fail;
 	}
 
 	data.dptr = (void *)&generation;
@@ -1954,7 +2050,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 					NULL) != 0) {
 			DEBUG(DEBUG_ERR,("Failed to cancel recovery transaction\n"));
 		}
-		return -1;
+		goto fail;
 	}
 
 	DEBUG(DEBUG_NOTICE,(__location__ " started transactions on all nodes\n"));
@@ -1966,7 +2062,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 				 pnn, nodemap, generation);
 		if (ret != 0) {
 			DEBUG(DEBUG_ERR, (__location__ " Failed to recover database 0x%x\n", dbmap->dbs[i].dbid));
-			return -1;
+			goto fail;
 		}
 	}
 
@@ -1979,7 +2075,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 					NULL, NULL,
 					NULL) != 0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to commit recovery changes. Recovery failed.\n"));
-		return -1;
+		goto fail;
 	}
 
 	DEBUG(DEBUG_NOTICE, (__location__ " Recovery - committed databases\n"));
@@ -1989,7 +2085,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = update_capabilities(ctdb, nodemap);
 	if (ret!=0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));
-		return -1;
+		goto fail;
 	}
 
 	/* build a new vnn map with all the currently active and
@@ -2029,7 +2125,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = update_vnnmap_on_all_nodes(ctdb, nodemap, pnn, vnnmap, mem_ctx);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to update vnnmap on all nodes\n"));
-		return -1;
+		goto fail;
 	}
 
 	DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated vnnmap\n"));
@@ -2038,7 +2134,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = set_recovery_master(ctdb, nodemap, pnn);
 	if (ret!=0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery master\n"));
-		return -1;
+		goto fail;
 	}
 
 	DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated recmaster\n"));
@@ -2047,7 +2143,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_NORMAL);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to normal on cluster\n"));
-		return -1;
+		goto fail;
 	}
 
 	DEBUG(DEBUG_NOTICE, (__location__ " Recovery - disabled recovery mode\n"));
@@ -2058,7 +2154,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 		DEBUG(DEBUG_ERR,("Failed to read public ips from remote node %d\n",
 				 culprit));
 		rec->need_takeover_run = true;
-		return -1;
+		goto fail;
 	}
 
 	do_takeover_run(rec, nodemap, false);
@@ -2067,7 +2163,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	ret = run_recovered_eventscript(rec, nodemap, "do_recovery");
 	if (ret!=0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'recovered' event on cluster. Recovery process failed.\n"));
-		return -1;
+		goto fail;
 	}
 
 	DEBUG(DEBUG_NOTICE, (__location__ " Recovery - finished the recovered event\n"));


-- 
Samba Shared Repository
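
Note on the ctdb_op_disable() abstraction (commit 55b2461): the rules it
enforces can be shown in isolation. What follows is only a minimal sketch
under stated assumptions, not the code from the patch. The names
op_gate, op_gate_init(), op_gate_begin(), op_gate_end() and
op_gate_disable() are hypothetical, and a caller-supplied clock value
with a lazily checked deadline stands in for the tevent timer and talloc
context that the real code uses. The three rules it mirrors from the
diff are: an operation cannot begin while disabled, it cannot be
disabled while in progress (-EAGAIN), and a timeout of 0 re-enables it
immediately.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for struct ctdb_op_state.  A deadline in
 * seconds replaces the tevent timer used by the real code. */
struct op_gate {
	const char *name;
	bool in_progress;
	uint64_t disabled_until;	/* 0 means "not disabled" */
};

static void op_gate_init(struct op_gate *g, const char *name)
{
	g->name = name;
	g->in_progress = false;
	g->disabled_until = 0;
}

/* Assumption: expiry is checked lazily against a caller-supplied
 * clock; the patch instead re-enables from a timer callback. */
static bool op_gate_is_disabled(struct op_gate *g, uint64_t now)
{
	if (g->disabled_until != 0 && now >= g->disabled_until) {
		g->disabled_until = 0;
	}
	return g->disabled_until != 0;
}

/* Refuse to start the operation while it is disabled. */
static bool op_gate_begin(struct op_gate *g, uint64_t now)
{
	if (op_gate_is_disabled(g, now)) {
		fprintf(stderr, "Unable to begin - %s are disabled\n",
			g->name);
		return false;
	}
	g->in_progress = true;
	return true;
}

static void op_gate_end(struct op_gate *g)
{
	g->in_progress = false;
}

/* timeout == 0 re-enables; otherwise disable for timeout seconds,
 * unless the operation is currently running. */
static int op_gate_disable(struct op_gate *g, uint64_t now, uint32_t timeout)
{
	if (timeout == 0) {
		g->disabled_until = 0;
		return 0;
	}
	if (g->in_progress) {
		return -EAGAIN;
	}
	g->disabled_until = now + timeout;
	return 0;
}
```

In this sketch a caller in the position of do_recovery() would call
op_gate_begin() first and return early on failure, then call
op_gate_end() on every exit path, which is the role the "goto fail"
conversion in commit 281f7e8 prepares for.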