The branch, master has been updated via 3ac2d4b ctdb-tests: Fix some incorrect memory allocations via a7ce00c ctdb-scripts: Use more unique temporary file names via 4afe822 ctdb-scripts: Don't remove temporary files before use via 2083883 ctdb-scripts: Superficial clean-ups to 10.interface via c23744c ctdb-scripts: Clarify logic for success of interface monitoring via d8e4c5a ctdb-scripts: Fix regression in updateip code via 18b0aea ctdb-ipalloc: Fix a memory leak via 24160ee ctdb-daemon: Don't leak memory if not using recovery lock via 56ce230 ctdb-recoverd: Fix some uninitialised memory issues via 8f73ae0 ctdb-daemon: Drop the "schedule for deletion" messages to DEBUG level from 9790abd ctdb/web: Fix typo.
https://git.samba.org/?p=samba.git;a=shortlog;h=master - Log ----------------------------------------------------------------- commit 3ac2d4b59eff58f06af1eef19cef0d444f4257ca Author: Martin Schwenke <mar...@meltin.net> Date: Tue Dec 1 15:32:07 2015 +1100 ctdb-tests: Fix some incorrect memory allocations These allocate enough memory but things get confusing if they're used as a guide when updating the code. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> Autobuild-User(master): Michael Adam <ob...@samba.org> Autobuild-Date(master): Tue Jan 12 22:19:16 CET 2016 on sn-devel-144 commit a7ce00cc0392cbbc009e2dbe8d58258d6ba0566f Author: Martin Schwenke <mar...@meltin.net> Date: Thu Dec 10 15:03:46 2015 +1100 ctdb-scripts: Use more unique temporary file names Consider this sequence of events: 1. Instance of script running update_tickles() hangs 2. Script debugging is launched asynchronously 3. New instance of script is launched, creates temporary file(s) 4. Original hung script makes progress before asynchronous script debugging kills it, so it removes temporary file(s) 5. New instance of script produces error due to missing files(s) This is obviously rare. Use more unique filenames to avoid step (4) removing the file(s) belonging to other instances of the script. This requires some extra cleanup to avoid too many temporary files (which is why unique filenames were not originally usd). It is sufficient to remove files modified at least 10 minutes ago. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> commit 4afe822397a0fe8f3e528077089577c89c9895e5 Author: Martin Schwenke <mar...@meltin.net> Date: Thu Dec 10 14:58:53 2015 +1100 ctdb-scripts: Don't remove temporary files before use They will be clobbered by the redirect anyway. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> commit 20838833f3fe3053c7f6aed8ac598319a3ecc346 Author: Martin Schwenke <mar...@meltin.net> Date: Fri Dec 18 13:16:27 2015 +1100 ctdb-scripts: Superficial clean-ups to 10.interface Whitespace and indentation improvements. Remove comments describing events, since the README covers that much better. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> commit c23744c61db321a911653d58ed258a3003cf798b Author: Martin Schwenke <mar...@meltin.net> Date: Wed Dec 16 19:18:49 2015 +1100 ctdb-scripts: Clarify logic for success of interface monitoring The current code uses so many shell idioms that it is difficult to follow. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> commit d8e4c5a468286ecc1c38ecd66a3606e84db02373 Author: Martin Schwenke <mar...@meltin.net> Date: Fri Dec 18 15:33:38 2015 +1100 ctdb-scripts: Fix regression in updateip code Regression introduced in commit 6471541d6d2bc9f2af0ff92b280abbd1d933cf88. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> commit 18b0aeaae0ba42549a9542b8c08923211e2976ad Author: Martin Schwenke <mar...@meltin.net> Date: Tue Dec 1 14:38:48 2015 +1100 ctdb-ipalloc: Fix a memory leak Commit cfa0ffe78073f9e3a014bb127fb9a4b7ad95fceb introduced a memory leak. Never assume... Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> commit 24160ee6a4a0727840d73955b99aef690450f345 Author: Martin Schwenke <mar...@meltin.net> Date: Mon Jan 11 13:41:30 2016 +1100 ctdb-daemon: Don't leak memory if not using recovery lock Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> commit 56ce230de72dbf14ccddb3f7b26b8b7f16986dfc Author: Martin Schwenke <mar...@meltin.net> Date: Mon Jan 11 17:23:12 2016 +1100 ctdb-recoverd: Fix some uninitialised memory issues The first element of these structures is a 32-bit PNN. On 64-bit systems this field can be followed by 32-bits of padding. When the structures are copied this can cause uninitialised memory to be copied. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> commit 8f73ae03cc50f26e85b78e35bf22e40eb1ff7684 Author: Martin Schwenke <mar...@meltin.net> Date: Thu Dec 17 12:27:58 2015 +1100 ctdb-daemon: Drop the "schedule for deletion" messages to DEBUG level Thousands of these can be generated each second, rendering INFO level debugging useless. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Michael Adam <ob...@samba.org> ----------------------------------------------------------------------- Summary of changes: ctdb/config/events.d/10.interface | 47 +++++++++++++----------------------- ctdb/config/functions | 16 +++++++----- ctdb/server/ctdb_monitor.c | 1 + ctdb/server/ctdb_recover.c | 14 +++++------ ctdb/server/ctdb_recoverd.c | 2 ++ ctdb/server/ctdb_takeover.c | 4 +-- ctdb/server/ctdb_vacuum.c | 10 ++++---- ctdb/tests/src/ctdb_takeover_tests.c | 11 +++++++-- 8 files changed, 53 insertions(+), 52 deletions(-) Changeset truncated at 500 lines: diff --git a/ctdb/config/events.d/10.interface b/ctdb/config/events.d/10.interface index 045a1cf..4b9f31c 100755 --- a/ctdb/config/events.d/10.interface +++ b/ctdb/config/events.d/10.interface @@ -152,13 +152,16 @@ monitor_interfaces() done - $fail || return 0 - - $up_interfaces_found && \ - [ "$CTDB_PARTIALLY_ONLINE_INTERFACES" = "yes" ] && \ + if $fail ; then + if $up_interfaces_found && \ + [ "$CTDB_PARTIALLY_ONLINE_INTERFACES" = "yes" ] ; then + return 0 + else + return 1 + fi + else return 0 - - return 1 + fi } # Sets: iface, ip, maskbits, family @@ -191,10 +194,8 @@ get_iface_ip_maskbits_family () ctdb_check_args "$@" -case "$1" in - ############################# - # called when ctdbd starts up - init) +case "$1" in + init) # make sure that we only respond to ARP messages from the NIC where # a particular ip address is associated. get_proc sys/net/ipv4/conf/all/arp_filter >/dev/null 2>&1 && { @@ -210,17 +211,11 @@ case "$1" in drop_all_public_ips ;; - ############################# - # called after ctdbd has done its initial recovery - # and we start the services to become healthy - startup) + startup) monitor_interfaces ;; - - ################################################ - # called when ctdbd wants to claim an IP address - takeip) + takeip) iface=$2 ip=$3 maskbits=$4 @@ -239,10 +234,7 @@ case "$1" in flush_route_cache ;; - - ################################################## - # called when ctdbd wants to release an IP address - releaseip) + releaseip) # releasing an IP is a bit more complex than it seems. Once the IP # is released, any open tcp connections to that IP on this host will end # up being stuck. Some of them (such as NFS connections) will be unkillable @@ -272,9 +264,7 @@ case "$1" in flush_route_cache ;; - ################################################## - # called when ctdbd wants to update an IP address - updateip) + updateip) # moving an IP is a bit more complex than it seems. # First we drop all traffic on the old interface. # Then we try to add the ip to the new interface and before @@ -291,7 +281,7 @@ case "$1" in _ip=$4 _maskbits=$5 - get_iface_ip_maskbits_family "$_oiface" "$ip" "$maskbits" + get_iface_ip_maskbits_family "$_oiface" "$_ip" "$_maskbits" oiface="$iface" # we do an extra delete to cope with the script being killed @@ -317,12 +307,10 @@ case "$1" in # tickle all existing connections, so that dropped packets # are retransmited and the tcp streams work - tickle_tcp_connections $ip - ;; - monitor) + monitor) monitor_interfaces || exit 1 ;; *) @@ -331,4 +319,3 @@ case "$1" in esac exit 0 - diff --git a/ctdb/config/functions b/ctdb/config/functions index eef8f7e..68e53ab 100755 --- a/ctdb/config/functions +++ b/ctdb/config/functions @@ -1082,17 +1082,18 @@ update_tickles () # IPs as a regexp choice _ipschoice="($(echo $_ips | sed -e 's/ /|/g' -e 's/\./\\\\./g'))" - # Record connections to our public IPs in a temporary file - _my_connections="${tickledir}/${_port}.connections" - rm -f "$_my_connections" + # Record connections to our public IPs in a temporary file. + # This temporary file is in CTDB's private state directory and + # $$ is used to avoid a very rare race involving CTDB's script + # debugging. No security issue, nothing to see here... + _my_connections="${tickledir}/${_port}.connections.$$" netstat -tn | awk -v destpat="^${_ipschoice}:${_port}\$" \ '$1 == "tcp" && $6 == "ESTABLISHED" && $4 ~ destpat {print $5, $4}' | sort >"$_my_connections" # Record our current tickles in a temporary file - _my_tickles="${tickledir}/${_port}.tickles" - rm -f "$_my_tickles" + _my_tickles="${tickledir}/${_port}.tickles.$$" for _i in $_ips ; do ctdb -X gettickles $_i $_port | awk -F'|' 'NR > 1 { printf "%s:%s %s:%s\n", $2, $3, $4, $5 }' @@ -1111,7 +1112,10 @@ update_tickles () ctdb deltickle $_src $_dst done - rm -f "$_my_connections" "$_my_tickles" + rm -f "$_my_connections" "$_my_tickles" + + # Remove stale files from killed scripts + find "$tickledir" -type f -mmin +10 | xargs -r rm } ######################################################## diff --git a/ctdb/server/ctdb_monitor.c b/ctdb/server/ctdb_monitor.c index d8eda2a..0a8273a 100644 --- a/ctdb/server/ctdb_monitor.c +++ b/ctdb/server/ctdb_monitor.c @@ -134,6 +134,7 @@ static void ctdb_health_callback(struct ctdb_context *ctdb, int status, void *p) c.pnn = ctdb->pnn; c.old_flags = node->flags; + ZERO_STRUCT(rd); rd.pnn = ctdb->pnn; rd.srvid = CTDB_SRVID_TAKEOVER_RUN_RESPONSE; diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c index 7b34d7e..127ff6b 100644 --- a/ctdb/server/ctdb_recover.c +++ b/ctdb/server/ctdb_recover.c @@ -592,13 +592,6 @@ int32_t ctdb_control_set_recmode(struct ctdb_context *ctdb, } } - state = talloc(ctdb, struct ctdb_set_recmode_state); - CTDB_NO_MEMORY(ctdb, state); - - state->start_time = timeval_current(); - state->fd[0] = -1; - state->fd[1] = -1; - /* release any deferred attach calls from clients */ if (recmode == CTDB_RECOVERY_NORMAL) { ctdb_process_deferred_attach(ctdb); @@ -610,6 +603,13 @@ int32_t ctdb_control_set_recmode(struct ctdb_context *ctdb, return 0; } + state = talloc(ctdb, struct ctdb_set_recmode_state); + CTDB_NO_MEMORY(ctdb, state); + + state->start_time = timeval_current(); + state->fd[0] = -1; + state->fd[1] = -1; + /* For the rest of what needs to be done, we need to do this in a child process since 1, the call to ctdb_recovery_lock() can block if the cluster diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c index 1d63526..c89649a 100644 --- a/ctdb/server/ctdb_recoverd.c +++ b/ctdb/server/ctdb_recoverd.c @@ -1650,6 +1650,7 @@ static bool do_takeover_run(struct ctdb_recoverd *rec, * wait for replies since a failure here might cause some * noise in the logs but will not actually cause a problem. */ + ZERO_STRUCT(dtr); dtr.srvid = 0; /* No reply */ dtr.pnn = -1; @@ -3202,6 +3203,7 @@ static int verify_local_ip_allocation(struct ctdb_context *ctdb, struct ctdb_rec DEBUG(DEBUG_CRIT,("Trigger takeoverrun\n")); + ZERO_STRUCT(rd); rd.pnn = ctdb->pnn; rd.srvid = 0; data.dptr = (uint8_t *)&rd; diff --git a/ctdb/server/ctdb_takeover.c b/ctdb/server/ctdb_takeover.c index 8de1666..227bd16 100644 --- a/ctdb/server/ctdb_takeover.c +++ b/ctdb/server/ctdb_takeover.c @@ -1434,7 +1434,7 @@ static int ctdb_reload_remote_public_ips(struct ctdb_context *ctdb, ret = ctdb_ctrl_get_public_ips_flags(ctdb, TAKEOVER_TIMEOUT(), j, - ctdb->nodes, + ipalloc_state->known_public_ips, 0, &ipalloc_state->known_public_ips[j]); if (ret != 0) { @@ -1454,7 +1454,7 @@ static int ctdb_reload_remote_public_ips(struct ctdb_context *ctdb, ret = ctdb_ctrl_get_public_ips_flags(ctdb, TAKEOVER_TIMEOUT(), j, - ctdb->nodes, + ipalloc_state->available_public_ips, CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE, &ipalloc_state->available_public_ips[j]); if (ret != 0) { diff --git a/ctdb/server/ctdb_vacuum.c b/ctdb/server/ctdb_vacuum.c index 54dfe99..f536129 100644 --- a/ctdb/server/ctdb_vacuum.c +++ b/ctdb/server/ctdb_vacuum.c @@ -1659,11 +1659,11 @@ static int insert_record_into_delete_queue(struct ctdb_db_context *ctdb_db, hash = (uint32_t)ctdb_hash(&key); - DEBUG(DEBUG_INFO, (__location__ " schedule for deletion: db[%s] " - "db_id[0x%08x] " - "key_hash[0x%08x] " - "lmaster[%u] " - "migrated_with_data[%s]\n", + DEBUG(DEBUG_DEBUG, (__location__ " schedule for deletion: db[%s] " + "db_id[0x%08x] " + "key_hash[0x%08x] " + "lmaster[%u] " + "migrated_with_data[%s]\n", ctdb_db->db_name, ctdb_db->db_id, hash, ctdb_lmaster(ctdb_db->ctdb, &key), diff --git a/ctdb/tests/src/ctdb_takeover_tests.c b/ctdb/tests/src/ctdb_takeover_tests.c index 6797de4..bbd004e 100644 --- a/ctdb/tests/src/ctdb_takeover_tests.c +++ b/ctdb/tests/src/ctdb_takeover_tests.c @@ -196,7 +196,12 @@ read_ctdb_public_ip_info(TALLOC_CTX *ctx , while (t != NULL) { n = (int) strtol(t, (char **) NULL, 10); if ((*known)[n] == NULL) { - (*known)[n] = talloc_array(ctx, struct ctdb_public_ip_list_old, CTDB_TEST_MAX_IPS); + /* Array size here has to be + * CTDB_TEST_MAX_IPS because total + * number of IPs isn't yet known */ + (*known)[n] = talloc_size(ctx, + offsetof(struct ctdb_public_ip_list_old, ips) + + CTDB_TEST_MAX_IPS * sizeof(struct ctdb_public_ip)); (*known)[n]->num = 0; } curr = (*known)[n]->num; @@ -210,7 +215,9 @@ read_ctdb_public_ip_info(TALLOC_CTX *ctx , } /* Build list of all allowed IPs */ - a = talloc_array(ctx, struct ctdb_public_ip_list_old, CTDB_TEST_MAX_IPS); + a = talloc_size(ctx, + offsetof(struct ctdb_public_ip_list_old, ips) + + numips * sizeof(struct ctdb_public_ip)); a->num = numips; for (ta = *all_ips, i=0; ta != NULL && i < numips ; ta = ta->next, i++) { a->ips[i].pnn = ta->pnn; -- Samba Shared Repository