The branch, master has been updated
       via  30c40046ef0 ctdb-build: Add missing dependency on talloc
       via  e831af7b257 ctdb-tests: Work around unreadable file test failure when root
       via  b20ccaa36da ctdb-scripts: Use "git config" as last resort to parse nfs.conf
       via  db37043bc5c ctdb-scripts: Avoid ShellCheck warning SC2295
       via  00f1d6d9476 ctdb-common: Use POSIX if_nameindex() to check interface existence
       via  b686bbb4ac3 replace: Add check for if_nameindex()
       via  c77a4fde7aa ctdb-daemon: Modernise debug in ctdb_add_public_address()
       via  d62fcba7dce ctdb-daemon: Avoid spurious error sending ARPs for released IP
       via  f5a20377347 ctdb-daemon: Modernise debug in ctdb_control_send_arp()
       via  ec5f6425b70 ctdb-protocol: Add separator argument to ctdb_connection_to_buf()
       via  440bd86a992 ctdb-daemon: Drop unused ban_state element from CTDB node structure
       via  9898e7c5558 ctdb-recoverd: Clean up banning culprit code
       via  19fbc2da383 ctdb-recoverd: Add pnn field to banning state structure
       via  0b5dd076046 ctdb-recoverd: Add function node_flags() and use it in elections
      from  e396eb9fbc7 ctdb-scripts: Only run unhealthy call-out when passing threshold

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 30c40046ef0b52da1dee3a65117c20da2a75955b
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jul 22 11:41:57 2022 +1000

    ctdb-build: Add missing dependency on talloc

    The include isn't strictly necessary, since it is included via
    common/reqid.c anyway.  However, it is a useful hint.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

    Autobuild-User(master): Amitay Isaacs <ami...@samba.org>
    Autobuild-Date(master): Fri Jul 22 17:01:00 UTC 2022 on sn-devel-184

commit e831af7b25760dbbc2a0fc5366b36cd885aac838
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jul 22 11:05:21 2022 +1000

    ctdb-tests: Work around unreadable file test failure when root

    root can read files for which the mode prohibits reading, so this test
    case fails when run as root.  Work around this when running as root.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b20ccaa36da23c8ee84b117b2e82e98bd2be4fcc
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Jul 21 14:22:25 2022 +1000

    ctdb-scripts: Use "git config" as last resort to parse nfs.conf

    Some versions of nfs-utils (e.g. recent CentOS 7) use /etc/nfs.conf
    but do not include the nfsconf utility to extract values from the
    file.  However, git has an excellent conf file parser, so use it as a
    last resort.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit db37043bc5c67e536bcaaf1941cb12ec2e72efc9
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri May 27 23:23:48 2022 +1000

    ctdb-scripts: Avoid ShellCheck warning SC2295

    For example:

    In /home/martins/samba/samba/ctdb/tools/onnode line 304:
    [ "$nodes" != "${nodes%[ ${nl}]*}" ] && verbose=true
                             ^---^ SC2295 (info): Expansions inside ${..} need to be quoted separately, otherwise they match as patterns.

    Did you mean:
    [ "$nodes" != "${nodes%[ "${nl}"]*}" ] && verbose=true

    For more information:
      https://www.shellcheck.net/wiki/SC2295 -- Expansions inside ${..} need to b...

    Who knew?  Thanks ShellCheck!

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 00f1d6d94764ba1312500c72fd08e7df3fae064b
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Jul 5 12:31:57 2022 +1000

    ctdb-common: Use POSIX if_nameindex() to check interface existence

    This works as an unprivileged user, so avoids unnecessary errors when
    running in test mode (and not as root):

      2022-02-18T12:21:12.436491+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket
      2022-02-18T12:21:12.436534+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket
      2022-02-18T12:21:12.436557+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket
      2022-02-18T12:21:12.436577+11:00 node.0 ctdbd[6958]: ctdb_sys_check_iface_exists: Failed to open raw socket

    The corresponding porting test would now become pointless because it
    would just confirm that "fake" does not exist.  Attempt to make it
    useful by using a less likely name than "fake" and attempting to
    detect the loopback interface.
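
The unprivileged check described in this commit message can be illustrated with a small standalone sketch. This is not the code from the commit: the function name iface_exists() is hypothetical, and the loop condition is deliberately a little stricter than strictly required, stopping at the first entry with a zero index or a NULL name. Only the POSIX calls if_nameindex() and if_freenameindex() are assumed.

```c
#include <assert.h>
#include <net/if.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from the commit): return true if an
 * interface called "name" is currently known to the system.
 * if_nameindex() does not need a raw socket, so this is expected to
 * work for an unprivileged user.
 */
bool iface_exists(const char *name)
{
	struct if_nameindex *list, *it;
	bool found = false;

	assert(name != NULL);

	list = if_nameindex();
	if (list == NULL) {
		/* The list could not be retrieved; report "not found" */
		return false;
	}

	/* POSIX terminates the array with if_index 0 and if_name NULL */
	for (it = list; it->if_index != 0 && it->if_name != NULL; it++) {
		if (strcmp(name, it->if_name) == 0) {
			found = true;
			break;
		}
	}

	/* The array is allocated by the C library and must be released */
	if_freenameindex(list);

	return found;
}
```

A caller might use iface_exists("eth0") before configuring an address. Whether a failed lookup should be treated as "missing" or as "assume present" is a policy decision for the caller; the sketch picks the former.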

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b686bbb4ac37296e23e74c1c10145f22b6d29d42
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Jul 21 11:25:37 2022 +1000

    replace: Add check for if_nameindex()

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit c77a4fde7aaa7130b969f5d49ac75abb2acfffd0
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Jul 5 12:17:05 2022 +1000

    ctdb-daemon: Modernise debug in ctdb_add_public_address()

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit d62fcba7dce6038c02c12b3531e953e7b808614a
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Jun 23 14:30:34 2022 +1000

    ctdb-daemon: Avoid spurious error sending ARPs for released IP

    A public IP address can be released in between (and probably before)
    attempts to send ARPs.  One situation when this can occur is when a
    cluster is shutting down: node A shuts down first, public IPs from
    node A are taken over by node B, node B is shutdown.

    Notice this when it occurs and cancel further attempts to send ARPs.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit f5a20377347aba18700d010d4201775fc83a0b1b
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Jul 5 19:33:15 2022 +1000

    ctdb-daemon: Modernise debug in ctdb_control_send_arp()

    For the tickle ACK logging, render the connection in a buffer.  This
    produces more complete information.
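
Rendering a connection into a caller-supplied buffer with a configurable separator can be sketched as follows. This is only an illustration, not the CTDB function: endpoints_to_buf() is a hypothetical name, and it assumes both endpoints have already been converted to strings. It shows the snprintf() pattern of treating truncation as an error (ENOSPC).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical, simplified formatter: "server" and "client" are
 * already-rendered endpoint strings.  Returns 0 on success, or ENOSPC
 * if the result (including the terminating NUL) does not fit in buf.
 */
int endpoints_to_buf(char *buf, size_t buflen,
		     const char *server, const char *client,
		     bool client_first, const char *sep)
{
	int ret;

	assert(buf != NULL && server != NULL && client != NULL && sep != NULL);

	if (client_first) {
		ret = snprintf(buf, buflen, "%s%s%s", client, sep, server);
	} else {
		ret = snprintf(buf, buflen, "%s%s%s", server, sep, client);
	}

	/* snprintf() returns the length that the full output needs */
	if (ret < 0 || (size_t)ret >= buflen) {
		return ENOSPC;
	}

	return 0;
}
```

Passing " -> " as the separator with client_first set gives a log-friendly "client -> server" rendering, while passing " " gives a plain space-separated pair. On ENOSPC the buffer contents should not be relied upon, so a caller may prefer to log a placeholder instead.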

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit ec5f6425b70672af591df3113962c636d8f65005
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Jul 19 11:53:15 2022 +1000

    ctdb-protocol: Add separator argument to ctdb_connection_to_buf()

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 440bd86a9925bd5b97fd5130e3e5a4ac104ee5dd
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Jul 29 13:39:03 2020 +1000

    ctdb-daemon: Drop unused ban_state element from CTDB node structure

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 9898e7c5558e47c4666c552ef907a49e231dd2c7
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Jul 29 13:30:04 2020 +1000

    ctdb-recoverd: Clean up banning culprit code

    Make this fully self-contained in the recovery daemon and avoid
    indexing by PNN.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 19fbc2da383245522a58a222c1bca75d4ad98c8e
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Jul 29 12:15:03 2020 +1000

    ctdb-recoverd: Add pnn field to banning state structure

    This structure is now standalone, so indexing by PNN can be avoided
    via a subsequent commit.  Index by culprit here to make this commit
    simple.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 0b5dd076046f254bb8d60c1b4377c32a3dc59a10
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Jul 29 17:57:53 2020 +1000

    ctdb-recoverd: Add function node_flags() and use it in elections

    Indexing a node map by PNN is suboptimal.
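
The difference between indexing a node map by PNN and searching it for a PNN can be illustrated with a minimal sketch. The types and the name lookup_node_flags() below are hypothetical simplifications, not the CTDB structures, and the sketch assumes that a node's PNN is not guaranteed to equal its position in the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified node map (not the CTDB types) */
struct node_entry {
	uint32_t pnn;
	uint32_t flags;
};

struct node_map {
	size_t num;
	const struct node_entry *nodes;
};

/*
 * Search the map for the entry whose pnn field matches, instead of
 * assuming that entry N describes node N.  If the node is found, its
 * flags are stored via "flags" (when non-NULL) and true is returned.
 */
bool lookup_node_flags(const struct node_map *map,
		       uint32_t pnn,
		       uint32_t *flags)
{
	size_t i;

	assert(map != NULL);

	for (i = 0; i < map->num; i++) {
		if (map->nodes[i].pnn == pnn) {
			if (flags != NULL) {
				*flags = map->nodes[i].flags;
			}
			return true;
		}
	}

	return false;
}
```

With a map holding PNNs 0, 2 and 5, a direct nodes[2] access would return the entry for PNN 5, whereas the search returns the entry for PNN 2 and reports PNN 1 as unknown. Passing NULL for the flags pointer turns the function into a pure existence check.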

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/common/system.c                               |  38 +++---
 ctdb/config/events/legacy/13.per_ip_routing.script |   2 +-
 ctdb/config/statd-callout                          |   9 +-
 ctdb/include/ctdb_private.h                        |   3 -
 ctdb/protocol/protocol_util.c                      |  20 ++-
 ctdb/protocol/protocol_util.h                      |   7 +-
 ctdb/server/ctdb_recoverd.c                        | 151 ++++++++++++++-------
 ctdb/server/ctdb_takeover.c                        |  55 +++++---
 ctdb/tests/UNIT/cunit/porting_tests_001.sh         |  15 +-
 ctdb/tests/UNIT/cunit/tunable_test_001.sh          |   8 +-
 ctdb/tests/run_tests.sh                            |   6 +-
 ctdb/tests/scripts/integration.bash                |   4 +-
 ctdb/tests/src/porting_tests.c                     |  18 +--
 ctdb/tests/src/reqid_test.c                        |   1 +
 ctdb/tools/onnode                                  |   4 +-
 ctdb/wscript                                       |   2 +-
 lib/replace/wscript                                |   2 +-
 17 files changed, 215 insertions(+), 130 deletions(-)


Changeset truncated at 500 lines:

diff --git a/ctdb/common/system.c b/ctdb/common/system.c
index 650b62bab16..08dc68284fd 100644
--- a/ctdb/common/system.c
+++ b/ctdb/common/system.c
@@ -148,32 +148,36 @@ void ctdb_wait_for_process_to_exit(pid_t pid)
 	}
 }

-#ifdef HAVE_AF_PACKET
+#ifdef HAVE_IF_NAMEINDEX

 bool ctdb_sys_check_iface_exists(const char *iface)
 {
-	int s;
-	struct ifreq ifr;
+	struct if_nameindex *ifnis, *ifni;
+	bool found = false;

-	s = socket(AF_PACKET, SOCK_RAW, 0);
-	if (s == -1){
-		/* We don't know if the interface exists, so assume yes */
-		DBG_ERR("Failed to open raw socket\n");
-		return true;
+	ifnis = if_nameindex();
+	if (ifnis == NULL) {
+		DBG_ERR("Failed to retrieve inteface list\n");
+		return false;
 	}

-	strlcpy(ifr.ifr_name, iface, sizeof(ifr.ifr_name));
-	if (ioctl(s, SIOCGIFINDEX, &ifr) < 0 && errno == ENODEV) {
-		DBG_ERR("Interface '%s' not found\n", iface);
-		close(s);
-		return false;
+	for (ifni = ifnis;
+	     ifni->if_index != 0 || ifni->if_name != NULL;
+	     ifni++) {
+		int cmp = strcmp(iface, ifni->if_name);
+		if (cmp == 0) {
+			found = true;
+			goto done;
+		}
 	}

-	close(s);
-	return true;
+done:
+	if_freenameindex(ifnis);
+
+	return found;
 }

-#else /* HAVE_AF_PACKET */
+#else /* HAVE_IF_NAMEINDEX */

 bool ctdb_sys_check_iface_exists(const char *iface)
 {
@@ -181,7 +185,7 @@ bool ctdb_sys_check_iface_exists(const char *iface)
 	return true;
 }

-#endif /* HAVE_AF_PACKET */
+#endif /* HAVE_IF_NAMEINDEX */


 #ifdef HAVE_PEERCRED
diff --git a/ctdb/config/events/legacy/13.per_ip_routing.script b/ctdb/config/events/legacy/13.per_ip_routing.script
index e25647613bb..d7949c6dedb 100755
--- a/ctdb/config/events/legacy/13.per_ip_routing.script
+++ b/ctdb/config/events/legacy/13.per_ip_routing.script
@@ -346,7 +346,7 @@ remove_bogus_routes ()
 	# be done with grep, but let's do it with shell prefix removal
 	# to avoid unnecessary processes.  This falls through if
 	# "@${_i}@" isn't present in $_ips.
-	[ "$_ips" = "${_ips#*@${_i}@}" ] || continue
+	[ "$_ips" = "${_ips#*@"${_i}"@}" ] || continue

 	echo "Removing ip rule/routes for unhosted public address $_i"
 	del_routing_for_ip "$_i"
diff --git a/ctdb/config/statd-callout b/ctdb/config/statd-callout
index 83fb92eccf0..38c155e4793 100755
--- a/ctdb/config/statd-callout
+++ b/ctdb/config/statd-callout
@@ -32,8 +32,13 @@ die ()
 load_system_config "nfs" "nfs-common"

 # If NFS_HOSTNAME not set then try to pull it out of /etc/nfs.conf
-if [ -z "$NFS_HOSTNAME" ] && type nfsconf >/dev/null 2>&1 ; then
-	NFS_HOSTNAME=$(nfsconf --get statd name)
+if [ -z "$NFS_HOSTNAME" ]; then
+	if type nfsconf >/dev/null 2>&1; then
+		NFS_HOSTNAME=$(nfsconf --get statd name)
+	elif type git >/dev/null 2>&1; then
+		# git to the rescue!
+		NFS_HOSTNAME=$(git config --file=/etc/nfs.conf statd.name)
+	fi
 fi

 [ -n "$NFS_HOSTNAME" ] || \
diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index 3193005db75..3395f077ab9 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -88,9 +88,6 @@ struct ctdb_node {
 	/* a list of controls pending to this node, so we can time them out quickly
 	   if the node becomes disconnected */
 	struct daemon_control_state *pending_controls;
-
-	/* used by the recovery daemon to track when a node should be banned */
-	struct ctdb_banning_state *ban_state;
 };

 /*
diff --git a/ctdb/protocol/protocol_util.c b/ctdb/protocol/protocol_util.c
index 28631c8de61..fe757658f48 100644
--- a/ctdb/protocol/protocol_util.c
+++ b/ctdb/protocol/protocol_util.c
@@ -497,8 +497,11 @@ bool ctdb_sock_addr_same(const ctdb_sock_addr *addr1,
 	return (ctdb_sock_addr_cmp(addr1, addr2) == 0);
 }

-int ctdb_connection_to_buf(char *buf, size_t buflen,
-			   struct ctdb_connection *conn, bool client_first)
+int ctdb_connection_to_buf(char *buf,
+			   size_t buflen,
+			   struct ctdb_connection *conn,
+			   bool client_first,
+			   const char *sep)
 {
 	char server[64], client[64];
 	int ret;
@@ -516,9 +519,9 @@ int ctdb_connection_to_buf(char *buf, size_t buflen,
 	}

 	if (! client_first) {
-		ret = snprintf(buf, buflen, "%s %s", server, client);
+		ret = snprintf(buf, buflen, "%s%s%s", server, sep, client);
 	} else {
-		ret = snprintf(buf, buflen, "%s %s", client, server);
+		ret = snprintf(buf, buflen, "%s%s%s", client, sep, server);
 	}
 	if (ret < 0 || (size_t)ret >= buflen) {
 		return ENOSPC;
@@ -540,7 +543,7 @@ char *ctdb_connection_to_string(TALLOC_CTX *mem_ctx,
 		return NULL;
 	}

-	ret = ctdb_connection_to_buf(out, len, conn, client_first);
+	ret = ctdb_connection_to_buf(out, len, conn, client_first, " ");
 	if (ret != 0) {
 		talloc_free(out);
 		return NULL;
@@ -666,8 +669,11 @@ char *ctdb_connection_list_to_string(
 		char buf[128];
 		int ret;

-		ret = ctdb_connection_to_buf(buf, sizeof(buf),
-					     &conn_list->conn[i], client_first);
+		ret = ctdb_connection_to_buf(buf,
+					     sizeof(buf),
+					     &conn_list->conn[i],
+					     client_first,
+					     " ");
 		if (ret != 0) {
 			talloc_free(out);
 			return NULL;
diff --git a/ctdb/protocol/protocol_util.h b/ctdb/protocol/protocol_util.h
index b01db8e9934..2bdbb0c2ad0 100644
--- a/ctdb/protocol/protocol_util.h
+++ b/ctdb/protocol/protocol_util.h
@@ -55,8 +55,11 @@ bool ctdb_sock_addr_same_ip(const ctdb_sock_addr *addr1,
 bool ctdb_sock_addr_same(const ctdb_sock_addr *addr1,
 			 const ctdb_sock_addr *addr2);

-int ctdb_connection_to_buf(char *buf, size_t buflen,
-			   struct ctdb_connection * conn, bool client_first);
+int ctdb_connection_to_buf(char *buf,
+			   size_t buflen,
+			   struct ctdb_connection * conn,
+			   bool client_first,
+			   const char *sep);
 char *ctdb_connection_to_string(TALLOC_CTX *mem_ctx,
 				struct ctdb_connection * conn,
 				bool client_first);
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index c293aa7f037..bf3a66b0aaf 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -237,6 +237,7 @@ static int ctdb_op_disable(struct ctdb_op_state *state,
 }

 struct ctdb_banning_state {
+	uint32_t pnn;
 	uint32_t count;
 	struct timeval last_reported_time;
 };
@@ -253,6 +254,7 @@ struct ctdb_recoverd {
 	struct tevent_timer *leader_broadcast_timeout_te;
 	uint32_t pnn;
 	uint32_t last_culprit_node;
+	struct ctdb_banning_state *banning_state;
 	struct ctdb_node_map_old *nodemap;
 	struct timeval priority_time;
 	bool need_takeover_run;
@@ -290,6 +292,23 @@ static bool this_node_can_be_leader(struct ctdb_recoverd *rec)
 		(rec->ctdb->capabilities & CTDB_CAP_RECMASTER) != 0;
 }

+static bool node_flags(struct ctdb_recoverd *rec, uint32_t pnn, uint32_t *flags)
+{
+	size_t i;
+
+	for (i = 0; i < rec->nodemap->num; i++) {
+		struct ctdb_node_and_flags *node = &rec->nodemap->nodes[i];
+		if (node->pnn == pnn) {
+			if (flags != NULL) {
+				*flags = node->flags;
+			}
+			return true;
+		}
+	}
+
+	return false;
+}
+
 /*
   ban a node for a period of time
  */
@@ -324,33 +343,75 @@ enum monitor_result { MONITOR_OK, MONITOR_RECOVERY_NEEDED, MONITOR_ELECTION_NEED
 /*
   remember the trouble maker
  */
-static void ctdb_set_culprit_count(struct ctdb_recoverd *rec, uint32_t culprit, uint32_t count)
-{
-	struct ctdb_context *ctdb = talloc_get_type(rec->ctdb, struct ctdb_context);
-	struct ctdb_banning_state *ban_state;
+static void ctdb_set_culprit_count(struct ctdb_recoverd *rec,
+				   uint32_t culprit,
+				   uint32_t count)
+{
+	struct ctdb_context *ctdb = talloc_get_type_abort(
+		rec->ctdb, struct ctdb_context);
+	struct ctdb_banning_state *ban_state = NULL;
+	size_t len;
+	bool ok;

-	if (culprit > ctdb->num_nodes) {
-		DEBUG(DEBUG_ERR,("Trying to set culprit %d but num_nodes is %d\n", culprit, ctdb->num_nodes));
+	ok = node_flags(rec, culprit, NULL);
+	if (!ok) {
+		DBG_WARNING("Unknown culprit node %"PRIu32"\n", culprit);
 		return;
 	}

 	/* If we are banned or stopped, do not set other nodes as culprits */
 	if (rec->node_flags & NODE_FLAGS_INACTIVE) {
-		DEBUG(DEBUG_NOTICE, ("This node is INACTIVE, cannot set culprit node %d\n", culprit));
+		D_WARNING("This node is INACTIVE, cannot set culprit node %d\n",
+			  culprit);
 		return;
 	}

-	if (ctdb->nodes[culprit]->ban_state == NULL) {
-		ctdb->nodes[culprit]->ban_state = talloc_zero(ctdb->nodes[culprit], struct ctdb_banning_state);
-		CTDB_NO_MEMORY_VOID(ctdb, ctdb->nodes[culprit]->ban_state);
+	if (rec->banning_state == NULL) {
+		len = 0;
+	} else {
+		size_t i;
+
+		len = talloc_array_length(rec->banning_state);
-

+		for (i = 0 ; i < len; i++) {
+			if (rec->banning_state[i].pnn == culprit) {
+				ban_state= &rec->banning_state[i];
+				break;
+			}
+		}
 	}
-	ban_state = ctdb->nodes[culprit]->ban_state;
-	if (timeval_elapsed(&ban_state->last_reported_time) > ctdb->tunable.recovery_grace_period) {
-		/* this was the first time in a long while this node
-		   misbehaved so we will forgive any old transgressions.
-		*/
+
+	/* Not found, so extend (or allocate new) array */
+	if (ban_state == NULL) {
+		struct ctdb_banning_state *t;
+
+		len += 1;
+		/*
+		 * talloc_realloc() handles the corner case where
+		 * rec->banning_state is NULL
+		 */
+		t = talloc_realloc(rec,
+				   rec->banning_state,
+				   struct ctdb_banning_state,
+				   len);
+		if (t == NULL) {
+			DBG_WARNING("Memory allocation errror");
+			return;
+		}
+		rec->banning_state = t;
+
+		/* New element is always at the end - initialise it... */
+		ban_state = &rec->banning_state[len - 1];
+		*ban_state = (struct ctdb_banning_state) {
+			.pnn = culprit,
+			.count = 0,
+		};
+	} else if (ban_state->count > 0 &&
+		   timeval_elapsed(&ban_state->last_reported_time) >
+		   ctdb->tunable.recovery_grace_period) {
+		/*
+		 * Forgive old transgressions beyond the tunable time-limit
+		 */
 		ban_state->count = 0;
 	}

@@ -359,6 +420,12 @@ static void ctdb_set_culprit_count(struct ctdb_recoverd *rec, uint32_t culprit,
 	rec->last_culprit_node = culprit;
 }

+static void ban_counts_reset(struct ctdb_recoverd *rec)
+{
+	D_NOTICE("Resetting ban count to 0 for all nodes\n");
+	TALLOC_FREE(rec->banning_state);
+}
+
 /*
   remember the trouble maker
  */
@@ -931,28 +998,26 @@ static void cluster_lock_release(struct ctdb_recoverd *rec)

 static void ban_misbehaving_nodes(struct ctdb_recoverd *rec, bool *self_ban)
 {
-	struct ctdb_context *ctdb = rec->ctdb;
-	unsigned int i;
-	struct ctdb_banning_state *ban_state;
+	size_t len = talloc_array_length(rec->banning_state);
+	size_t i;
+

 	*self_ban = false;
-	for (i=0; i<ctdb->num_nodes; i++) {
-		if (ctdb->nodes[i]->ban_state == NULL) {
-			continue;
-		}
-		ban_state = (struct ctdb_banning_state *)ctdb->nodes[i]->ban_state;
-		if (ban_state->count < 2*ctdb->num_nodes) {
+	for (i = 0; i < len; i++) {
+		struct ctdb_banning_state *ban_state = &rec->banning_state[i];
+
+		if (ban_state->count < 2 * rec->nodemap->num) {
 			continue;
 		}

 		D_NOTICE("Node %u reached %u banning credits\n",
-			 ctdb->nodes[i]->pnn,
+			 ban_state->pnn,
 			 ban_state->count);
-		ctdb_ban_node(rec, ctdb->nodes[i]->pnn);
+		ctdb_ban_node(rec, ban_state->pnn);
 		ban_state->count = 0;

 		/* Banning ourself? */
-		if (ctdb->nodes[i]->pnn == rec->pnn) {
+		if (ban_state->pnn == rec->pnn) {
 			*self_ban = true;
 		}
 	}
@@ -1343,25 +1408,10 @@ static int do_recovery(struct ctdb_recoverd *rec, TALLOC_CTX *mem_ctx)
 	rec->need_recovery = false;
 	ctdb_op_end(rec->recovery);

-	/* we managed to complete a full recovery, make sure to forgive
-	   any past sins by the nodes that could now participate in the
-	   recovery.
-	*/
-	DEBUG(DEBUG_ERR,("Resetting ban count to 0 for all nodes\n"));
-	for (i=0;i<nodemap->num;i++) {
-		struct ctdb_banning_state *ban_state;
-
-		if (nodemap->nodes[i].flags & NODE_FLAGS_DISCONNECTED) {
-			continue;
-		}
-
-		ban_state = (struct ctdb_banning_state *)ctdb->nodes[nodemap->nodes[i].pnn]->ban_state;
-		if (ban_state == NULL) {
-			continue;
-		}
-
-		ban_state->count = 0;
-	}
+	/*
+	 * Completed a full recovery so forgive any past transgressions
+	 */
+	ban_counts_reset(rec);

 	/* We just finished a recovery successfully.
 	   We now wait for rerecovery_timeout before we allow
@@ -1398,6 +1448,7 @@ static void ctdb_election_data(struct ctdb_recoverd *rec, struct election_messag
 	int ret;
 	struct ctdb_node_map_old *nodemap;
 	struct ctdb_context *ctdb = rec->ctdb;
+	bool ok;

 	ZERO_STRUCTP(em);

@@ -1410,7 +1461,11 @@ static void ctdb_election_data(struct ctdb_recoverd *rec, struct election_messag
 		return;
 	}

-	rec->node_flags = nodemap->nodes[rec->pnn].flags;
+	ok = node_flags(rec, rec->pnn, &rec->node_flags);
+	if (!ok) {
+		DBG_ERR("Unable to get node flags for this node\n");
+		return;
+	}
 	em->node_flags = rec->node_flags;

 	for (i=0;i<nodemap->num;i++) {
diff --git a/ctdb/server/ctdb_takeover.c b/ctdb/server/ctdb_takeover.c
index c1e4f683784..0fb8076ad55 100644
--- a/ctdb/server/ctdb_takeover.c
+++ b/ctdb/server/ctdb_takeover.c
@@ -373,8 +373,17 @@ static void ctdb_control_send_arp(struct tevent_context *ev,
 							struct ctdb_takeover_arp);
 	int ret;
 	struct ctdb_tcp_array *tcparray;
-	const char *iface = ctdb_vnn_iface_string(arp->vnn);
+	const char *iface;

+	/* IP address might have been released between sends */
+	if (arp->vnn->iface == NULL) {
+		DBG_INFO("Cancelling ARP send for released IP %s\n",
+			 ctdb_addr_to_str(&arp->vnn->public_address));
+		talloc_free(arp);
+		return;
+	}
+
+	iface = ctdb_vnn_iface_string(arp->vnn);
 	ret = ctdb_sys_send_arp(&arp->addr, iface);
 	if (ret != 0) {
 		DBG_ERR("Failed to send ARP on interface %s: %s\n",
@@ -387,19 +396,25 @@ static void ctdb_control_send_arp(struct tevent_context *ev,

 	for (i=0;i<tcparray->num;i++) {
 		struct ctdb_connection *tcon;
+		char buf[128];

 		tcon = &tcparray->connections[i];
-		DEBUG(DEBUG_INFO,("sending tcp tickle ack for %u->%s:%u\n",
-			(unsigned)ntohs(tcon->dst.ip.sin_port),
-			ctdb_addr_to_str(&tcon->src),
-			(unsigned)ntohs(tcon->src.ip.sin_port)));
+		ret = ctdb_connection_to_buf(buf,
+					     sizeof(buf),
+					     tcon,
+					     true,
+					     " -> ");
+		if (ret != 0) {
+			strlcpy(buf, "UNKNOWN", sizeof(buf));
+		}
+		D_INFO("Send TCP tickle ACK: %s\n", buf);
 		ret = ctdb_sys_send_tcp(
 			&tcon->src,
 			&tcon->dst,
 			0, 0, 0);
 		if (ret != 0) {
-			DEBUG(DEBUG_CRIT,(__location__ " Failed to send tcp tickle ack for %s\n",
-				ctdb_addr_to_str(&tcon->src)));
+			DBG_ERR("Failed to send TCP tickle ACK: %s\n",
+				buf);
 		}
 	}
 }
@@ -1055,9 +1070,8 @@ static int ctdb_add_public_address(struct ctdb_context *ctdb,
 	/* Verify that we don't have an entry for this IP yet */
 	for (vnn = ctdb->vnn; vnn != NULL; vnn = vnn->next) {
 		if (ctdb_same_sockaddr(addr, &vnn->public_address)) {
-			DEBUG(DEBUG_ERR,
-			      ("Duplicate public IP address '%s'\n",
-			       ctdb_addr_to_str(addr)));
+			D_ERR("Duplicate public IP address '%s'\n",
+			      ctdb_addr_to_str(addr));
 			return -1;
 		}
 	}
@@ -1065,39 +1079,40 @@ static int ctdb_add_public_address(struct ctdb_context *ctdb,
 	/* Create a new VNN structure for this IP address */
 	vnn = talloc_zero(ctdb, struct ctdb_vnn);
 	if (vnn == NULL) {
-		DEBUG(DEBUG_ERR, (__location__ " out of memory\n"));
+		DBG_ERR("Memory allocation error\n");
 		return -1;
 	}
 	tmp = talloc_strdup(vnn, ifaces);
 	if (tmp == NULL) {


-- 
Samba Shared Repository