The branch, master has been updated
via 4569c652881 ctdb-scripts: Add configuration variable CTDB_KILLTCP_USE_SS_KILL
via 19e65f4012f ctdb-scripts: Factor out function kill_tcp_summarise()
via 590a86dbe4a ctdb-scripts: Track connections for all ports for public IPs
via c3695722b63 ctdb-scripts: Get connections after tickle list
via 9683bb3ac2b ctdb-scripts: Move connection tracking to 10.interface
via d39a1cc1d4f ctdb-server: Use ctdb_connection_same() to simplify
via 1b1fd5c2280 ctdb: Don't leak a pointer on talloc_realloc failure
via e080add68ab ctdb: Save a few lines with talloc_zero()
via 762f5f5ca63 ctdb-server: Remove duplicate logic
via 5af8627feb8 ctdb-server: Handle pre-existing connection first
via 9838b4d0d6c ctdb-server: Drop an unnecessary variable
via f4a8f84328c ctdb-server: Drop a log message to DEBUG level
via 3c19c8df778 ctdb-server: Clean up connection tracking functions
via 0505d06b12a ctdb-scripts: Use ss -H option to simplify
via 32e4f786601 ctdb-scripts: Remove superseded compatibility code
via b3e2c69ad92 ctdb-scripts: update_tickles() should use the public IPs cache
via 1a4a6c46f1c ctdb-scripts: Don't list connections when not hosting IPs
via 3410eddd932 ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn"
via 025bd34dfcf ctdb-doc: Improve 10.interface documentation and comments
via 60067e2a74d ctdb-tests: Fix ss -a not supported
via 4817e32c1da ctdb-tests: Drop unsupported long options from ss stub usage
via 557b0342002 ctdb-tests: Ensure ss stub handles square brackets around addresses
from 982042115b1 libndr: specialise ndr_token_find() for key pointer comparison
https://git.samba.org/?p=samba.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit 4569c65288177969ca1e4d9bd6badec60552beb9
Author: Martin Schwenke <[email protected]>
Date: Tue Aug 22 12:13:44 2023 +1000
ctdb-scripts: Add configuration variable CTDB_KILLTCP_USE_SS_KILL
This allows CTDB to be configured to use "ss -K" to reset TCP
connections on "releaseip". This is only supported when the kernel is
configured with CONFIG_INET_DIAG_DESTROY enabled.
From the documentation:
ss -K has been supported in ss since iproute 4.5 in March 2016 and
in the Linux kernel since 4.4 in December 2015. However, the
required kernel configuration item CONFIG_INET_DIAG_DESTROY is
disabled by default. Although enabled in Debian kernels since
~2017 and in Ubuntu since at least 18.04, this has only recently
been enabled in distributions such as RHEL. There seems to be no
way, including running ss -K, to determine if this is supported, so
use of this feature needs to be configurable. When available, it
should be the fastest, most reliable way of killing connections.
For RHEL and derivatives, this was enabled as follows:
* RHEL 8 via https://bugzilla.redhat.com/show_bug.cgi?id=2230213,
arriving in version kernel-4.18.0-513.5.1.el8_9
* RHEL 9 via https://issues.redhat.com/browse/RHEL-212, arriving in
kernel-5.14.0-360.el9
Enabling this option results in a small behaviour change because ss -K
always does a 2-way kill (i.e. it also sends a RST to the client).
Only a 1-way kill is done for SMB connections when ctdb_killtcp is
used - the reasons for this are shrouded in history and the 2-way kill
seems to work fine.
For the summary that is logged, when CTDB_KILLTCP_USE_SS_KILL is "yes"
or "try", always log the method used, even the fallback to
ctdb_killtcp. However, when set to "no", maintain the existing
output.
The decision to use -K rather than --kill is because short options are
trivial to implement in test stubs.
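As a rough illustration of the yes/try/no behaviour described above, the following sketch shows which kill methods would be attempted for each setting. It is a simplified model of the case statement in kill_tcp_connections(), not the actual code: try_ss_kill(), fallback_killtcp() and kill_methods() are hypothetical stand-ins that only print a method name instead of running "ss -K" or ctdb_killtcp.

```shell
#!/bin/sh
# Hypothetical stand-ins: the real script runs "ss -K" and
# ctdb_killtcp; these only print the name of the method.
try_ss_kill()
{
	echo "ss -K"
}

fallback_killtcp()
{
	echo "ctdb_killtcp"
}

# Print the methods that would be attempted, in order, for a given
# value of CTDB_KILLTCP_USE_SS_KILL.
kill_methods()
{
	_setting="$1"

	case "$_setting" in
	yes | try)
		try_ss_kill
		if [ "$_setting" = "yes" ]; then
			# "yes": rely on ss -K alone, no fallback
			return
		fi
		;;
	esac

	# "try" arrives here after ss -K; "no" (or unset) comes
	# straight here
	fallback_killtcp
}

for _s in yes try no; do
	printf '%s: %s\n' "$_s" "$(kill_methods "$_s" | tr '\n' ' ')"
done
```

With "try", both methods run, which is why the summary logs the method name for each of them.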
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
Autobuild-User(master): Martin Schwenke <[email protected]>
Autobuild-Date(master): Thu Nov 7 00:12:34 UTC 2024 on atb-devel-224
commit 19e65f4012f286b279dbefeae74500d867592a27
Author: Martin Schwenke <[email protected]>
Date: Fri Aug 25 10:00:57 2023 +1000
ctdb-scripts: Factor out function kill_tcp_summarise()
This will be used in a slightly different context in a subsequent
commit. In that case, the number of killed connections will be passed
instead of the total number of connections, so support this here via
different modes instead of churning later.
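The two modes can be sketched as below. This is a cut-down illustration rather than the function from ctdb/config/functions: summarise() is a hypothetical name, and the number of remaining connections is passed in as a third argument here, whereas the real function counts them itself.

```shell
#!/bin/sh
# summarise MODE COUNT REMAINING
#
# Hypothetical helper: REMAINING is an argument here only to keep
# the sketch self-contained.
summarise()
{
	_mode="$1"
	_count="$2"
	_remaining="$3"

	case "$_mode" in
	total)
		# Caller knows how many connections existed beforehand
		_total="$_count"
		_killed=$((_total - _remaining))
		;;
	killed)
		# Caller knows how many connections were killed
		_killed="$_count"
		_total=$((_killed + _remaining))
		;;
	*)
		echo "unknown mode: ${_mode}" >&2
		return 1
		;;
	esac

	echo "Killed ${_killed}/${_total} TCP connections"
}

# Both calls describe the same situation: 5 connections, 2 left over
summarise total 5 2
summarise killed 3 2
```

Either way the summary line is the same, so callers can pass whichever count they have to hand.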
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 590a86dbe4adf45ac8d15497934e25ea98148034
Author: Martin Schwenke <[email protected]>
Date: Mon Oct 23 14:17:36 2023 +1100
ctdb-scripts: Track connections for all ports for public IPs
Currently TCP ports like NFS lock manager are not tracked. It is
easier to track all connections than to add a configuration system to
try to track specified ports, so do that.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit c3695722b6316b624aa6c44cad4f44279303d1b1
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 30 10:50:00 2024 +1000
ctdb-scripts: Get connections after tickle list
Running ss to get current connections before running ctdb gettickles
means the ss output might be out of date when the 2 lists are
compared. Some tickles might have been added after ss was run by some
other means (e.g. SMB tickles, added internally) and they would be
deleted according to the stale ss output.
This isn't currently a problem because update_tickles() is currently
only called with port 2049, so all tickles are managed by this code.
That will change in a subsequent commit.
Changing the order means the reverse problem can occur, where
update_tickles() attempts to delete an already deleted tickle. That
may happen occasionally but is harmless because it doesn't result in
missing information. It (currently) just causes a message to be
logged at DEBUG level.
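The comparison of the two lists can be sketched with comm, as below. The addresses are made-up sample data and the file names are placeholders. The "add" direction (comm -23) matches the diff; the "delete" direction via comm -13 is an assumption, since that part of the changeset is truncated.

```shell
#!/bin/sh
_dir=$(mktemp -d) || exit 1
trap 'rm -rf "$_dir"' EXIT

# Made-up sample data: one "client server" pair per line, sorted,
# because comm requires sorted input
printf '%s\n' \
	'10.0.0.1:1001 192.168.1.10:2049' \
	'10.0.0.2:1002 192.168.1.10:445' |
	sort >"${_dir}/connections"
printf '%s\n' \
	'10.0.0.2:1002 192.168.1.10:445' \
	'10.0.0.3:1003 192.168.1.10:2049' |
	sort >"${_dir}/tickles"

# Connections without a tickle: candidates for adding a tickle
to_add=$(comm -23 "${_dir}/connections" "${_dir}/tickles")

# Tickles without a connection: candidates for deletion (assumed)
to_delete=$(comm -13 "${_dir}/connections" "${_dir}/tickles")

echo "add: ${to_add}"
echo "delete: ${to_delete}"
```

If the tickle list is older than the connection list, a tickle registered in between looks like it has no connection and is wrongly deleted; taking the tickle list first turns that into the harmless case of deleting something already gone.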
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 9683bb3ac2bbdf0e83c3be3681f9d1c8ee7cc327
Author: Martin Schwenke <[email protected]>
Date: Mon Oct 23 14:05:21 2023 +1100
ctdb-scripts: Move connection tracking to 10.interface
This should really be done for all connections to public IP addresses.
Leave the port number there for now - this is just the first step.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit d39a1cc1d4f874e398f87a6778a868ec1f9178eb
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 30 12:21:59 2024 +1000
ctdb-server: Use ctdb_connection_same() to simplify
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 1b1fd5c2280ee7f5a3caba0779bf5208c11359db
Author: Volker Lendecke <[email protected]>
Date: Wed Nov 6 11:51:04 2024 +0100
ctdb: Don't leak a pointer on talloc_realloc failure
We should not directly overwrite the pointer we are realloc'ing.
Signed-off-by: Volker Lendecke <[email protected]>
Reviewed-by: Martin Schwenke <[email protected]>
commit e080add68ab748a290533ec1fcb97c6aef319418
Author: Volker Lendecke <[email protected]>
Date: Wed Nov 6 11:49:36 2024 +0100
ctdb: Save a few lines with talloc_zero()
Signed-off-by: Volker Lendecke <[email protected]>
Reviewed-by: Martin Schwenke <[email protected]>
commit 762f5f5ca6350cc0b93c71f06abc963e13793e0e
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 30 12:40:57 2024 +1000
ctdb-server: Remove duplicate logic
Initialise the pointer to NULL and fall through to let
talloc_realloc() do the allocation. talloc_realloc() does the right
thing with a NULL pointer...
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 5af8627feb805b65b9bf28a295f2f7f81f5f8826
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 30 12:37:57 2024 +1000
ctdb-server: Handle pre-existing connection first
This is cheap when tcparray is NULL and lets the code that now
follows be simplified.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 9838b4d0d6cfdcf87c5aa6eac2252dd1579173cf
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 30 12:34:18 2024 +1000
ctdb-server: Drop an unnecessary variable
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit f4a8f84328c5e692ce63bec05bb71fcb469a3e9c
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 30 12:30:13 2024 +1000
ctdb-server: Drop a log message to DEBUG level
This is harmless, so it doesn't generally need to be logged.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 3c19c8df778070705485b3c993e695ca1636bfa7
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 30 12:22:46 2024 +1000
ctdb-server: Clean up connection tracking functions
Apply README.Coding, modernise logging, pre-render connection as a
string for logging, switch terminology from "tickle" to "connection",
tidy up comments.
No changes in functionality.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 0505d06b12a04a5c5e813fb3f4799278f9e5b7eb
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 16 12:26:53 2024 +1000
ctdb-scripts: Use ss -H option to simplify
This option has been available since ~2018 and has been implemented in
the stub since then. I guess we didn't use it because CentOS 7?
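A small sketch of the simplification, using a hypothetical fake_ss() function with made-up output in place of the real ss so that it runs anywhere:

```shell
#!/bin/sh
# Hypothetical stand-in for "ss -tn": prints a header line unless
# called with -H, then one made-up connection
fake_ss()
{
	if [ "$1" != "-H" ]; then
		echo "Recv-Q Send-Q Local Address:Port Peer Address:Port"
	fi
	echo "0 0 192.168.1.10:2049 10.0.0.1:1001"
}

# Old style: skip the header line in awk
old=$(fake_ss | awk 'NR > 1 {print $3, $4}')

# New style: no header is printed, so no skipping is needed
new=$(fake_ss -H | awk '{print $3, $4}')

echo "old: ${old}"
echo "new: ${new}"
```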
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 32e4f786601712e57992ce4c8f46e5d38620a5dd
Author: Martin Schwenke <[email protected]>
Date: Mon Oct 23 14:23:45 2023 +1100
ctdb-scripts: Remove superseded compatibility code
Since commit 224e99804efef960ef4ce2ff2f4f6dced1e74146, square brackets
have been parsed by daemon and tool code, so drop the compatibility
code from here.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit b3e2c69ad92c0d20bb10146d2dd6d0d475455298
Author: Martin Schwenke <[email protected]>
Date: Thu Sep 19 14:32:46 2024 +1000
ctdb-scripts: update_tickles() should use the public IPs cache
This avoids duplicating logic.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 1a4a6c46f1cdabfea67c264d6576a597a70c3007
Author: Martin Schwenke <[email protected]>
Date: Thu Sep 19 13:52:48 2024 +1000
ctdb-scripts: Don't list connections when not hosting IPs
With an empty IP filter, all incoming connections to port 2049 will be
listed, not just those to public IP addresses. This causes error
messages like the following to be logged:
ctdb-eventd[...]: 60.nfs: Failed to add 1 tickles
since the connection being added seems to be for a random NFS mount
that doesn't use a public IP address.
This has been a problem for a long time (probably since commit
04fe9e20749985c71fef1bce7f6e4c439fe11c81 in 2015). It isn't currently
a huge deal because it only affects NFS connections. However, this
code will soon be used to track connections to public IP addresses on
all ports. This would result in a constant stream of log messages,
since there will always be some active connections.
The theory behind the fix is that if a node hosts no public IPs then
it should have no relevant connections and has no business changing
the list of registered tickles.
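To see why an empty list of public IPs is a problem, the filter construction can be sketched as follows. build_ip_filter() is a hypothetical wrapper around the same parameter expansions that update_tickles() uses, and the addresses are examples only.

```shell
#!/bin/sh
# Read IPs (one per line) on stdin and print the ss filter
# expression that selects connections to any of them
build_ip_filter()
{
	_ip_filter=""
	while read -r _ip; do
		_ip_filter="${_ip_filter}${_ip_filter:+ || }src [${_ip}]"
	done

	# Only add parentheses when there is something to wrap
	echo "${_ip_filter:+( ${_ip_filter} )}"
}

# Two public IPs: a filter restricted to those addresses
printf '%s\n' 192.168.1.10 fd00::10 | build_ip_filter

# No public IPs: the filter is empty, so ss would not be restricted
# to public IP addresses at all
build_ip_filter </dev/null
```

With no IPs the expression expands to nothing, so returning early when the public IPs cache is empty avoids running ss with no address restriction.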
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 3410eddd932b430acc687c81a5dc6e62a0a420a6
Author: Martin Schwenke <[email protected]>
Date: Fri Sep 13 16:21:24 2024 +1000
ctdb-scripts: Reformat with "shfmt -w -p -i 0 -fn"
Massage a couple of lines manually so they're formatted sanely given
the new indentation. Re-run shfmt to ensure no further changes.
Best reviewed with "git show -w".
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 025bd34dfcf790d06080501f0263667506137736
Author: Martin Schwenke <[email protected]>
Date: Tue Aug 22 12:12:50 2023 +1000
ctdb-doc: Improve 10.interface documentation and comments
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 60067e2a74d58d9b31a5eef657ec33fbdc7ec514
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 16 12:32:02 2024 +1000
ctdb-tests: Fix ss -a not supported
This is currently just a series of typos.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 4817e32c1da5e9d6f0e3594e67f1d2bed66463ac
Author: Martin Schwenke <[email protected]>
Date: Mon Sep 16 12:19:00 2024 +1000
ctdb-tests: Drop unsupported long options from ss stub usage
These have not been supported since commit
896c77df1ce2645c6dd7898b59ea802e204dc7d9 in 2018.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
commit 557b034200269aadb5c23d53207a988fc313c97f
Author: Martin Schwenke <[email protected]>
Date: Fri Oct 27 11:06:23 2023 +1100
ctdb-tests: Ensure ss stub handles square brackets around addresses
It isn't unreasonable for unit test cases to use square brackets in
their input.
Signed-off-by: Martin Schwenke <[email protected]>
Reviewed-by: Volker Lendecke <[email protected]>
Reviewed-by: Jerry Heyman <[email protected]>
-----------------------------------------------------------------------
Summary of changes:
ctdb/config/events/legacy/10.interface.script | 124 ++++++++++----------
ctdb/config/events/legacy/60.nfs.script | 1 -
ctdb/config/functions | 118 +++++++++++--------
ctdb/doc/ctdb-script.options.5.xml | 94 ++++++++++++++-
ctdb/server/ctdb_takeover.c | 141 ++++++++++++-----------
ctdb/tests/UNIT/eventscripts/10.interface.020.sh | 27 +++++
ctdb/tests/UNIT/eventscripts/10.interface.021.sh | 32 +++++
ctdb/tests/UNIT/eventscripts/10.interface.022.sh | 35 ++++++
ctdb/tests/UNIT/eventscripts/10.interface.023.sh | 40 +++++++
ctdb/tests/UNIT/eventscripts/10.interface.030.sh | 27 +++++
ctdb/tests/UNIT/eventscripts/10.interface.031.sh | 35 ++++++
ctdb/tests/UNIT/eventscripts/10.interface.032.sh | 40 +++++++
ctdb/tests/UNIT/eventscripts/10.interface.033.sh | 52 +++++++++
ctdb/tests/UNIT/eventscripts/stubs/ss | 37 ++++--
14 files changed, 615 insertions(+), 188 deletions(-)
create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.020.sh
create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.021.sh
create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.022.sh
create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.023.sh
create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.030.sh
create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.031.sh
create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.032.sh
create mode 100755 ctdb/tests/UNIT/eventscripts/10.interface.033.sh
Changeset truncated at 500 lines:
diff --git a/ctdb/config/events/legacy/10.interface.script b/ctdb/config/events/legacy/10.interface.script
index 9aa067b4a61..8d2d6968a1d 100755
--- a/ctdb/config/events/legacy/10.interface.script
+++ b/ctdb/config/events/legacy/10.interface.script
@@ -1,11 +1,9 @@
#!/bin/sh
-#################################
-# interface event script for ctdb
-# this adds/removes IPs from your
-# public interface
+# Handle public IP address release and takeover, as well as monitoring
+# interfaces used by public IP addresses.
-[ -n "$CTDB_BASE" ] || \
+[ -n "$CTDB_BASE" ] ||
CTDB_BASE=$(d=$(dirname "$0") && cd -P "$d" && dirname "$PWD")
. "${CTDB_BASE}/functions"
@@ -13,7 +11,7 @@
load_script_options
if ! have_public_addresses; then
- if [ "$1" = "init" ] ; then
+ if [ "$1" = "init" ]; then
echo "No public addresses file found"
fi
exit 0
@@ -32,8 +30,8 @@ monitor_interfaces()
#
# public_ifaces set by get_public_ifaces() above
# shellcheck disable=SC2154
- for _iface in $public_ifaces ; do
- if interface_monitor "$_iface" ; then
+ for _iface in $public_ifaces; do
+ if interface_monitor "$_iface"; then
up_interfaces_found=true
$CTDB setifacelink "$_iface" up >/dev/null 2>&1
else
@@ -42,11 +40,11 @@ monitor_interfaces()
fi
done
- if ! $down_interfaces_found ; then
+ if ! $down_interfaces_found; then
return 0
fi
- if ! $up_interfaces_found ; then
+ if ! $up_interfaces_found; then
return 1
fi
@@ -58,63 +56,61 @@ monitor_interfaces()
}
# Sets: iface, ip, maskbits
-get_iface_ip_maskbits ()
+get_iface_ip_maskbits()
{
- _iface_in="$1"
- ip="$2"
- _maskbits_in="$3"
-
- # Intentional word splitting here
- # shellcheck disable=SC2046
- set -- $(ip_maskbits_iface "$ip")
- if [ -n "$1" ] ; then
- maskbits="$1"
- iface="$2"
-
- if [ "$iface" != "$_iface_in" ] ; then
- printf \
- 'WARNING: Public IP %s hosted on interface %s but VNN says %s\n' \
- "$ip" "$iface" "$_iface_in"
- fi
- if [ "$maskbits" != "$_maskbits_in" ] ; then
- printf \
- 'WARNING: Public IP %s has %s bit netmask but VNN says %s\n' \
- "$ip" "$maskbits" "$_maskbits_in"
+ _iface_in="$1"
+ ip="$2"
+ _maskbits_in="$3"
+
+ # Intentional word splitting here
+ # shellcheck disable=SC2046
+ set -- $(ip_maskbits_iface "$ip")
+ if [ -n "$1" ]; then
+ maskbits="$1"
+ iface="$2"
+
+ if [ "$iface" != "$_iface_in" ]; then
+ printf 'WARNING: Public IP %s hosted on interface %s but VNN says %s\n' \
+ "$ip" "$iface" "$_iface_in"
+ fi
+ if [ "$maskbits" != "$_maskbits_in" ]; then
+ printf 'WARNING: Public IP %s has %s bit netmask but VNN says %s\n' \
+ "$ip" "$maskbits" "$_maskbits_in"
+ fi
+ else
+ die "ERROR: Unable to determine interface for IP ${ip}"
fi
- else
- die "ERROR: Unable to determine interface for IP ${ip}"
- fi
}
-ip_block ()
+ip_block()
{
_ip="$1"
_iface="$2"
case "$_ip" in
*:*) _family="inet6" ;;
- *) _family="inet" ;;
+ *) _family="inet" ;;
esac
# Extra delete copes with previously killed script
iptables_wrapper "$_family" \
- -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null
+ -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null
iptables_wrapper "$_family" \
- -I INPUT -i "$_iface" -d "$_ip" -j DROP
+ -I INPUT -i "$_iface" -d "$_ip" -j DROP
}
-ip_unblock ()
+ip_unblock()
{
_ip="$1"
_iface="$2"
case "$_ip" in
*:*) _family="inet6" ;;
- *) _family="inet" ;;
+ *) _family="inet" ;;
esac
iptables_wrapper "$_family" \
- -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null
+ -D INPUT -i "$_iface" -d "$_ip" -j DROP 2>/dev/null
}
ctdb_check_args "$@"
@@ -122,11 +118,11 @@ ctdb_check_args "$@"
case "$1" in
init)
_promote="sys/net/ipv4/conf/all/promote_secondaries"
- get_proc "$_promote" >/dev/null 2>&1 || \
- die "Public IPs only supported if promote_secondaries is available"
+ get_proc "$_promote" >/dev/null 2>&1 ||
+ die "Public IPs only supported if promote_secondaries is available"
- # make sure we drop any ips that might still be held if
- # previous instance of ctdb got killed with -9 or similar
+ # Make sure we drop any IPs that might still be held if
+ # previous instance of ctdbd got killed with -9 or similar
drop_all_public_ips
;;
@@ -146,7 +142,7 @@ takeip)
update_my_public_ip_addresses "takeip" "$ip"
add_ip_to_iface "$iface" "$ip" "$maskbits" || {
- exit 1;
+ exit 1
}
# In case a previous "releaseip" for this IP was killed...
@@ -156,12 +152,15 @@ takeip)
;;
releaseip)
- # releasing an IP is a bit more complex than it seems. Once the IP
- # is released, any open tcp connections to that IP on this host will end
- # up being stuck. Some of them (such as NFS connections) will be unkillable
- # so we need to use the killtcp ctdb function to kill them off. We also
- # need to make sure that no new connections get established while we are
- # doing this! So what we do is this:
+ # Releasing an IP is a bit more complex than it seems. Once
+ # the IP is released, any open TCP connections to that IP on
+ # this host will end up being stuck. Some of them (such as NFS
+ # connections) will be unkillable so we need to terminate
+ # them. We also need to make sure that no new connections get
+ # established while we are doing this.
+ #
+ # The steps are:
+ #
# 1) firewall this IP, so no new external packets arrive for it
# 2) find existing connections, and kill them
# 3) remove the IP from the interface
@@ -186,17 +185,20 @@ releaseip)
;;
updateip)
- # moving an IP is a bit more complex than it seems.
- # First we drop all traffic on the old interface.
- # Then we try to add the ip to the new interface and before
- # we finally remove it from the old interface.
+ # Moving an IP is a bit more complex than it seems. First we
+ # drop all traffic on the old interface. Then we try to
+ # remove the IP from the old interface and add it to the new
+ # interface.
+ #
+ # The steps are:
#
# 1) firewall this IP, so no new external packets arrive for it
# 2) remove the IP from the old interface (and new interface, to be sure)
# 3) add the IP to the new interface
# 4) remove the firewall rule
# 5) use ctdb gratarp to propagate the new mac address
- # 6) use netstat -tn to find existing connections, and tickle them
+ # 6) send tickle ACKs for existing connections, so dropped
+ # packets are resent
_oiface=$2
niface=$3
_ip=$4
@@ -207,7 +209,7 @@ updateip)
# Could check maskbits too. However, that should never change
# so we want to notice if it does.
- if [ "$oiface" = "$niface" ] ; then
+ if [ "$oiface" = "$niface" ]; then
echo "Redundant \"updateip\" - ${ip} already on ${niface}"
exit 0
fi
@@ -226,10 +228,10 @@ updateip)
flush_route_cache
- # propagate the new mac address
+ # Propagate the new MAC address
$CTDB gratarp "$ip" "$niface"
- # tickle all existing connections, so that dropped packets
+ # Tickle all existing connections, so that dropped packets
# are retransmitted and the tcp streams work
tickle_tcp_connections "$ip"
;;
@@ -241,6 +243,8 @@ ipreallocated)
monitor)
monitor_interfaces || exit 1
+
+ update_tickles
;;
esac
diff --git a/ctdb/config/events/legacy/60.nfs.script b/ctdb/config/events/legacy/60.nfs.script
index bc5be241f67..b797ada9370 100755
--- a/ctdb/config/events/legacy/60.nfs.script
+++ b/ctdb/config/events/legacy/60.nfs.script
@@ -352,7 +352,6 @@ monitor)
exit $?
fi
- update_tickles 2049
nfs_update_lock_info
nfs_check_services
diff --git a/ctdb/config/functions b/ctdb/config/functions
index f8f539ad53f..1ca3cebbbca 100755
--- a/ctdb/config/functions
+++ b/ctdb/config/functions
@@ -499,7 +499,7 @@ ctdb_check_unix_socket()
return 1
fi
- _out=$(ss -l -x "src ${_sockpath}" | tail -n +2)
+ _out=$(ss -l -xH "src ${_sockpath}")
if [ -z "$_out" ]; then
echo "ERROR: ${service_name} not listening on ${_sockpath}"
return 1
@@ -509,6 +509,43 @@ ctdb_check_unix_socket()
################################################
# kill off any TCP connections with the given IP
################################################
+
+kill_tcp_summarise()
+{
+ _mode="$1"
+ _count="$2"
+ _method="$3"
+
+ _connections=$(get_tcp_connections_for_ip "$_ip")
+ if [ -z "$_connections" ]; then
+ _remaining=0
+ else
+ _remaining=$(echo "$_connections" | wc -l)
+ fi
+
+ case "$_mode" in
+ total)
+ _total="$_count"
+ _killed=$((_total - _remaining))
+ ;;
+ killed)
+ _killed="$_count"
+ _total=$((_killed + _remaining))
+ ;;
+ esac
+
+ _t="${_killed}/${_total}"
+ _m=""
+ if [ -n "$_method" ]; then
+ _m=", using ${_method}"
+ fi
+ echo "Killed ${_t} TCP connections to released IP ${_ip}${_m}"
+ if [ -n "$_connections" ]; then
+ echo "Remaining connections:"
+ echo "$_connections" | sed -e 's|^| |'
+ fi
+}
+
kill_tcp_connections()
{
_iface="$1"
@@ -519,6 +556,16 @@ kill_tcp_connections()
_oneway=true
fi
+ case "$CTDB_KILLTCP_USE_SS_KILL" in
+ yes | try)
+ _killcount=$(ss -K -tnH state established src "$_ip" | wc -l)
+ kill_tcp_summarise "killed" "$_killcount" "ss -K"
+ if [ "$CTDB_KILLTCP_USE_SS_KILL" = "yes" ]; then
+ return
+ fi
+ ;;
+ esac
+
get_tcp_connections_for_ip "$_ip" | {
_killcount=0
_connections=""
@@ -556,22 +603,11 @@ kill_tcp_connections()
return
}
- _connections=$(get_tcp_connections_for_ip "$_ip")
- if [ -z "$_connections" ]; then
- _remaining=0
- else
- _remaining=$(echo "$_connections" | wc -l)
- fi
-
- _actually_killed=$((_killcount - _remaining))
-
- _t="${_actually_killed}/${_killcount}"
- echo "Killed ${_t} TCP connections to released IP $_ip"
-
- if [ -n "$_connections" ]; then
- echo "Remaining connections:"
- echo "$_connections" | sed -e 's|^| |'
+ _method=""
+ if [ "$CTDB_KILLTCP_USE_SS_KILL" = "try" ]; then
+ _method="ctdb_killtcp"
fi
+ kill_tcp_summarise "total" "$_killcount" "$_method"
}
}
@@ -602,7 +638,7 @@ get_tcp_connections_for_ip()
{
_ip="$1"
- ss -tn state established "src [$_ip]" | awk 'NR > 1 {print $3, $4}'
+ ss -tnH state established "src [$_ip]" | awk '{print $3, $4}'
}
########################################################
@@ -1181,49 +1217,39 @@ nfs_callout()
update_tickles()
{
- _port="$1"
-
tickledir="${CTDB_SCRIPT_VARDIR}/tickles"
mkdir -p "$tickledir"
- # What public IPs do I hold?
- _pnn=$(ctdb_get_pnn)
- _ips=$($CTDB -X ip | awk -F'|' -v pnn="$_pnn" '$3 == pnn {print $2}')
+ # If not hosting any public IPs then can't have any connections...
+ if [ ! -s "$CTDB_MY_PUBLIC_IPS_CACHE" ]; then
+ return
+ fi
- # IPs and port as ss filters
+ # IPs ss filter
_ip_filter=""
- for _ip in $_ips; do
+ while read -r _ip; do
_ip_filter="${_ip_filter}${_ip_filter:+ || }src [${_ip}]"
- done
- _port_filter="sport == :${_port}"
+ done <"$CTDB_MY_PUBLIC_IPS_CACHE"
+
+ # Record our current tickles in a temporary file
+ _my_tickles="${tickledir}/all.tickles.$$"
+ while read -r _i; do
+ $CTDB -X gettickles "$_i" |
+ awk -F'|' 'NR > 1 { printf "%s:%s %s:%s\n", $2, $3, $4, $5 }'
+ done <"$CTDB_MY_PUBLIC_IPS_CACHE" |
+ sort >"$_my_tickles"
# Record connections to our public IPs in a temporary file.
# This temporary file is in CTDB's private state directory and
# $$ is used to avoid a very rare race involving CTDB's script
# debugging. No security issue, nothing to see here...
- _my_connections="${tickledir}/${_port}.connections.$$"
- # Parentheses are needed around the filters for precedence but
+ _my_connections="${tickledir}/all.connections.$$"
+ # Parentheses are needed around the IP filter for precedence but
# the parentheses can't be empty!
- #
- # Recent versions of ss print square brackets around IPv6
- # addresses. While it is desirable to update CTDB's address
- # parsing and printing code, something needs to be done here
- # for backward compatibility, so just delete the brackets.
- ss -tn state established \
- "${_ip_filter:+( ${_ip_filter} )}" \
- "${_port_filter:+( ${_port_filter} )}" |
- awk 'NR > 1 {print $4, $3}' |
- tr -d '][' |
+ ss -tnH state established "${_ip_filter:+( ${_ip_filter} )}" |
+ awk '{print $4, $3}' |
sort >"$_my_connections"
- # Record our current tickles in a temporary file
- _my_tickles="${tickledir}/${_port}.tickles.$$"
- for _i in $_ips; do
- $CTDB -X gettickles "$_i" "$_port" |
- awk -F'|' 'NR > 1 { printf "%s:%s %s:%s\n", $2, $3, $4, $5 }'
- done |
- sort >"$_my_tickles"
-
# Add tickles for connections that we haven't already got tickles for
comm -23 "$_my_connections" "$_my_tickles" |
$CTDB addtickle
diff --git a/ctdb/doc/ctdb-script.options.5.xml b/ctdb/doc/ctdb-script.options.5.xml
index 11597097a04..9298f9f3498 100644
--- a/ctdb/doc/ctdb-script.options.5.xml
+++ b/ctdb/doc/ctdb-script.options.5.xml
@@ -105,12 +105,102 @@
<title>10.interface</title>
<para>
- This event script handles monitoring of interfaces using by
- public IP addresses.
+ This event script handles public IP address release and
+ takeover, as well as monitoring interfaces used by public IP
+ addresses.
</para>
<variablelist>
+ <varlistentry>
+ <term>
+ CTDB_KILLTCP_USE_SS_KILL=yes|try|no
+ </term>
+ <listitem>
+ <para>
+ Whether to use <command>ss -K/--kill</command> to reset
+ incoming TCP connections to public IP addresses during
+ <command>releaseip</command>.
+ </para>
+
+ <para>
+ CTDB's standard method of resetting incoming TCP
+ connections during <command>releaseip</command> is via
+ its custom <command>ctdb_killtcp</command> command.
+ This uses network trickery to reset each connection:
+ send a "tickle ACK", capture the reply to extract the
+ TCP sequence number, send a reset (containing the
+ correct sequence number).
+ </para>
+
+ <para>
+ <command>ss -K</command> has been supported in
+ <command>ss</command> since iproute 4.5 in March 2016
+ and in the Linux kernel since 4.4 in December 2015.
+ However, the required kernel configuration item
+ <code>CONFIG_INET_DIAG_DESTROY</code> is disabled by
+ default. Although enabled in Debian kernels since ~2017
+ and in Ubuntu since at least 18.04, this has only
+ recently been enabled in distributions such as RHEL.
+ There seems to be no way, including running <command>ss
+ -K</command>, to determine if this is supported, so use
+ of this feature needs to be configurable. When
+ available, it should be the fastest, most reliable way
+ of killing connections.
+ </para>
+
+ <para>
+ Supported values are:
+ <variablelist>
+ <varlistentry>
+ <term>
--
Samba Shared Repository