The branch, master has been updated
       via  5a9e338330f ctdb-tests: Don't clean up test var directory in autotest target
       via  a2ab6485e02 ctdb-tests: Fix usage message
       via  3cb53a7a054 ctdb-tests: Wait to allow database attach/detach to take effect
       via  066cc5b0c56 ctdb-tests: Avoid bulk output in $out, prefer $outfile
       via  9d02452a246 ctdb-tests: Make try_command_on_node less error-prone
       via  7c3819d1ac2 ctdb-tests: Change sanity_check_output() to internally use $out
       via  b80967f5dcc ctdb-scripts: Drop script configuration variable CTDB_MONITOR_SWAP_USAGE
       via  8108b3134c0 ctdb-tests: Extend test to cover ctdb rddumpmemory
       via  f78d9388fb4 ctdb-tools: Fix ctdb dumpmemory to avoid printing trailing NUL
       via  95477e69e3e ctdb-daemon: Log when ctdbd CPU utilisation exceeds a threshold
       via  87032ccebdd ctdb-build: Add check for getrusage()
      from  3d42e257a61 s4 dns_server Bind9: Log opertion durations

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 5a9e338330fe136908a3a17a5df81c054c5cc5b0
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed May 1 15:17:14 2019 +1000

    ctdb-tests: Don't clean up test var directory in autotest target

    If the directory is always cleaned up then it is not possible to look
    at daemon logs to debug test failures.

    This target is only really used by autobuild.py, which (optionally)
    cleans up the parent directory anyway.

    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

    Autobuild-User(master): Amitay Isaacs <ami...@samba.org>
    Autobuild-Date(master): Tue May 7 06:56:01 UTC 2019 on sn-devel-184

commit a2ab6485e027ebb13871c7d83b7626ac5c9b98c0
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed May 1 15:10:28 2019 +1000

    ctdb-tests: Fix usage message

    Since commit 0e9ead8f28fced3ebfa888786a1dc5bb59e734a3 daemons have
    been shut down after each test, so this option no longer has anything
    to do with killing daemons.

    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 3cb53a7a05409925024d6a67bcfaeb962d896e0b
Author: Martin Schwenke <mar...@meltin.net>
Date:   Sat Apr 27 14:54:09 2019 +1000

    ctdb-tests: Wait to allow database attach/detach to take effect

    Sometimes the detach test fails:

      Check detaching single test database detach_test1.tdb
      BAD: database detach_test1.tdb is still attached
      Number of databases:4
      dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.0/db/volatile/detach_test4.tdb.0
      dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.0/db/volatile/detach_test3.tdb.0
      dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.0/db/volatile/detach_test2.tdb.0
      dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.0/db/volatile/detach_test1.tdb.0
      Number of databases:3
      dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.1/db/volatile/detach_test4.tdb.1
      dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.1/db/volatile/detach_test3.tdb.1
      dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.1/db/volatile/detach_test2.tdb.1
      Number of databases:4
      dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.2/db/volatile/detach_test4.tdb.2
      dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.2/db/volatile/detach_test3.tdb.2
      dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.2/db/volatile/detach_test2.tdb.2
      dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.2/db/volatile/detach_test1.tdb.2
      *** TEST COMPLETED (RC=1) AT 2019-04-27 03:35:40, CLEANING UP...

    When issued from a client, the detach control re-broadcasts itself
    asynchronously to all nodes and then returns success. The controls to
    some nodes to do the actual detach may still be in flight when
    success is returned to the client.
    Therefore, the test should wait for a few seconds to allow the
    asynchronous controls to complete.

    The same is true for the attach control, so workaround the problem in
    the attach test too.

    An alternative is to make the attach and detach controls synchronous
    by avoiding the broadcast and waiting for the results of the
    individual controls sent to the nodes. However, a simple
    implementation would involve adding new nested event loops.

    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 066cc5b0c561464ed08890d9aa1a1a55b545e9cc
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Apr 11 20:55:20 2019 +1000

    ctdb-tests: Avoid bulk output in $out, prefer $outfile

    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 9d02452a24625df5f62fd6d45a16effe2fa45fbe
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Mar 28 14:26:52 2019 +1100

    ctdb-tests: Make try_command_on_node less error-prone

    This sometimes fails, apparently due to a cat process in onnode
    getting EAGAIN. The conclusion is that tests that process large
    amounts of output should not depend on a sub-shell delivering that
    output into a shell variable.

    Change try_command_on_node() to leave all of the output in file
    $outfile and just put the first 1KB into $out. $outfile is removed
    after each test completes.

    Change the implementation of sanity_check_output() to use $outfile
    instead of $out.

    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 7c3819d1ac264acf998f426e0cef7f6211e0ddee
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Apr 30 12:09:26 2019 +1000

    ctdb-tests: Change sanity_check_output() to internally use $out

    All callers are currently passed $out.
    Global variable $out is used in many other places so use it here to
    simplify the interface and make future changes simpler.

    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b80967f5dcc6b58db0c38ec3e5cf0cbe46dbeb4b
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Mar 29 11:19:55 2019 +1100

    ctdb-scripts: Drop script configuration variable CTDB_MONITOR_SWAP_USAGE

    CTDB's system memory monitoring in 05.system.script monitors both main
    memory and swap.

    The swap monitoring was originally based on the (possibly incorrect,
    see below) idea that swap space stacks on top of main memory, so that
    when a system starts filling swap space then this is supposed to be a
    good sign that the system is running out of memory. Additionally,
    performance on a Linux system tends to be destroyed by the I/O
    associated with a lot of swapping to spinning disks.

    However, some platforms default to creating only 4GB of swap space
    even when there is 128GB of main memory. With such a small swap to
    main memory ratio, memory pressure can force swap to be nearly full
    even when a significant amount of main memory is still available and
    the system is performing well. This suggests that checking swap
    utilisation might be less than useful in many circumstances.

    So, remove the separate swap space checking and change the memory
    check to cover the total of main memory and swap space.

    Test function set_mem_usage() still takes an argument for each of
    main memory and swap space utilisation. For simplicity, the same
    number is now passed twice to make the intended results
    comprehensible. This could be changed later.

    A couple of tests are cleaned up to no longer use hard-coded
    /proc/meminfo and ps output.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 8108b3134c017c22d245fc5b2207a88d44ab0dd2
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Apr 11 16:58:10 2019 +1000

    ctdb-tests: Extend test to cover ctdb rddumpmemory

    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13923

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit f78d9388fb459dc83fafb4da6e683e3137ad40e1
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Apr 11 16:56:32 2019 +1000

    ctdb-tools: Fix ctdb dumpmemory to avoid printing trailing NUL

    Fix ctdb rddumpmemory too.

    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13923

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 95477e69e3e865cb4ee93f947074eef5c873750f
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jan 18 17:46:37 2019 +1100

    ctdb-daemon: Log when ctdbd CPU utilisation exceeds a threshold

    This is to help us notice when ctdbd is using the full capacity of a
    CPU, so is saturated.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 87032ccebdd13feef13d9da8d8958d928f36b75a
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jan 18 17:43:44 2019 +1100

    ctdb-build: Add check for getrusage()

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/config/events/legacy/05.system.script         |  17 +--
 ctdb/doc/ctdb-script.options.5.xml                 |  21 ----
 ctdb/doc/examples/config_migrate.sh                |   2 +-
 ctdb/server/ctdb_daemon.c                          | 123 +++++++++++++++++++++
 ctdb/tests/complex/11_ctdb_delip_removes_ip.sh     |  10 +-
 ctdb/tests/complex/18_ctdb_reloadips.sh            |   8 +-
 ctdb/tests/complex/32_cifs_tickle.sh               |   7 --
 ctdb/tests/complex/36_smb_reset_server.sh          |  12 +-
 ctdb/tests/complex/37_nfs_reset_server.sh          |   4 +-
 ctdb/tests/complex/60_rogueip_releaseip.sh         |   2 +-
 ctdb/tests/complex/scripts/local.bash              |   5 +-
 ctdb/tests/eventscripts/05.system.monitor.011.sh   |   3 +-
 ctdb/tests/eventscripts/05.system.monitor.012.sh   |   3 +-
 ctdb/tests/eventscripts/05.system.monitor.013.sh   |  21 ----
 ctdb/tests/eventscripts/05.system.monitor.014.sh   |   4 +-
 ctdb/tests/eventscripts/05.system.monitor.015.sh   |   4 +-
 ctdb/tests/eventscripts/05.system.monitor.016.sh   |  19 ----
 ctdb/tests/eventscripts/05.system.monitor.017.sh   |  30 +----
 ctdb/tests/eventscripts/05.system.monitor.018.sh   |  81 +++-----------
 ctdb/tests/run_tests.sh                            |   2 +-
 ctdb/tests/scripts/integration.bash                |  71 ++++++------
 ctdb/tests/simple/02_ctdb_tunables.sh              |   6 +-
 ctdb/tests/simple/05_ctdb_listnodes.sh             |   5 +-
 ctdb/tests/simple/08_ctdb_isnotrecmaster.sh        |  10 +-
 ctdb/tests/simple/09_ctdb_ping.sh                  |   6 +-
 ctdb/tests/simple/11_ctdb_ip.sh                    |  14 ++-
 ctdb/tests/simple/12_ctdb_getdebug.sh              |   3 +-
 ctdb/tests/simple/14_ctdb_statistics.sh            |   2 +-
 ctdb/tests/simple/15_ctdb_statisticsreset.sh       |  21 ++--
 ctdb/tests/simple/19_ip_takeover_noop.sh           |   4 +-
 ctdb/tests/simple/20_delip_iface_gc.sh             |  10 +-
 ctdb/tests/simple/21_ctdb_attach.sh                |  49 ++++----
 ctdb/tests/simple/23_ctdb_moveip.sh                |  25 ++++-
 ctdb/tests/simple/24_ctdb_getdbmap.sh              |  10 +-
 ctdb/tests/simple/25_dumpmemory.sh                 |   9 +-
 ..._ctdb_config_check_error_on_unreachable_ctdb.sh |   6 +-
 ctdb/tests/simple/27_ctdb_detach.sh                |  71 +++++++-----
 ctdb/tests/simple/35_ctdb_getreclock.sh            |   2 +-
 ctdb/tests/simple/51_message_ring.sh               |  14 +--
 ctdb/tests/simple/52_fetch_ring.sh                 |  14 +--
 ctdb/tests/simple/53_transaction_loop.sh           |   4 +-
 ctdb/tests/simple/54_transaction_loop_recovery.sh  |   4 +-
 ctdb/tests/simple/55_ctdb_ptrans.sh                |  12 +-
 .../simple/56_replicated_transaction_recovery.sh   |   4 +-
 ctdb/tests/simple/58_ctdb_restoredb.sh             |   8 +-
 ctdb/tests/simple/69_recovery_resurrect_deleted.sh |  10 +-
 ctdb/tests/simple/70_recoverpdbbyseqnum.sh         |   4 +-
 ctdb/tests/simple/71_ctdb_wipedb.sh                |   4 +-
 ctdb/tests/simple/72_update_record_persistent.sh   |   4 +-
 ctdb/tests/simple/75_readonly_records_basic.sh     |  24 ++--
 ctdb/tests/simple/77_ctdb_db_recovery.sh           |   6 +-
 ctdb/tests/simple/79_volatile_db_traverse.sh       |   4 +-
 ctdb/tests/simple/80_ctdb_traverse.sh              |   2 +-
 ctdb/tests/simple/81_tunnel_ring.sh                |  14 +--
 ctdb/tests/simple/90_debug_hung_script.sh          |   6 +-
 ctdb/tools/ctdb.c                                  |  10 +-
 ctdb/wscript                                       |   3 +-
 57 files changed, 428 insertions(+), 425 deletions(-)
 delete mode 100755 ctdb/tests/eventscripts/05.system.monitor.013.sh
 delete mode 100755 ctdb/tests/eventscripts/05.system.monitor.016.sh


Changeset truncated at 500 lines:

diff --git a/ctdb/config/events/legacy/05.system.script b/ctdb/config/events/legacy/05.system.script
index e2ffeac715a..08e401a9e73 100755
--- a/ctdb/config/events/legacy/05.system.script
+++ b/ctdb/config/events/legacy/05.system.script
@@ -132,9 +132,6 @@ monitor_memory_usage ()
 	if [ -z "$CTDB_MONITOR_MEMORY_USAGE" ] ; then
 		CTDB_MONITOR_MEMORY_USAGE=80
 	fi
-	if [ -z "$CTDB_MONITOR_SWAP_USAGE" ] ; then
-		CTDB_MONITOR_SWAP_USAGE=25
-	fi
 
 	_meminfo=$(get_proc "meminfo")
 	# Intentional word splitting here
@@ -149,21 +146,19 @@ $1 == "SwapFree:" { swapfree = $2 }
 $1 == "SwapTotal:" { swaptotal = $2 }
 END {
     if (memavail != 0) { memfree = memavail ; }
-    if (memtotal != 0) { print int((memtotal - memfree) / memtotal * 100) ; } else { print 0 ; }
-    if (swaptotal != 0) { print int((swaptotal - swapfree) / swaptotal * 100) ; } else { print 0 ; }
+    if (memtotal + swaptotal != 0) {
+        usedtotal = memtotal - memfree + swaptotal - swapfree
+        print int(usedtotal / (memtotal + swaptotal) * 100)
+    } else {
+        print 0
+    }
 }')
 	_mem_usage="$1"
-	_swap_usage="$2"
 
 	check_thresholds "System memory" \
 			 "$CTDB_MONITOR_MEMORY_USAGE" \
 			 "$_mem_usage" \
 			 dump_memory_info
-
-	check_thresholds "System swap" \
-			 "$CTDB_MONITOR_SWAP_USAGE" \
-			 "$_swap_usage" \
-			 dump_memory_info
 }
 
 
diff --git a/ctdb/doc/ctdb-script.options.5.xml b/ctdb/doc/ctdb-script.options.5.xml
index 9d545b5cc0d..6b2efb27ac2 100644
--- a/ctdb/doc/ctdb-script.options.5.xml
+++ b/ctdb/doc/ctdb-script.options.5.xml
@@ -964,27 +964,6 @@ CTDB_PER_IP_ROUTING_TABLE_ID_HIGH=9000
 	</listitem>
       </varlistentry>
 
-      <varlistentry>
-	<term>
-	  CTDB_MONITOR_SWAP_USAGE=<parameter>SWAP-LIMITS</parameter>
-	</term>
-	<listitem>
-	  <para>
-	    SWAP-LIMITS takes the form
-	    <parameter>WARN_LIMIT</parameter><optional>:<parameter>UNHEALTHY_LIMIT</parameter></optional>
-	    indicating that warnings should be logged if
-	    swap usage reaches WARN_LIMIT%. If usage reaches
-	    UNHEALTHY_LIMIT then the node should be flagged
-	    unhealthy. Either WARN_LIMIT or UNHEALTHY_LIMIT may be
-	    left blank, meaning that check will be omitted.
-	  </para>
-	  <para>
-	    Default is 25, so warnings will be logged when swap
-	    usage reaches 25%.
-	  </para>
-	</listitem>
-      </varlistentry>
-
     </variablelist>
   </refsect2>
 
diff --git a/ctdb/doc/examples/config_migrate.sh b/ctdb/doc/examples/config_migrate.sh
index 8479aeb39f3..e0d01e77057 100755
--- a/ctdb/doc/examples/config_migrate.sh
+++ b/ctdb/doc/examples/config_migrate.sh
@@ -209,6 +209,7 @@ CTDB_NOTIFY_SCRIPT
 CTDB_PUBLIC_INTERFACE
 CTDB_MAX_PERSISTENT_CHECK_ERRORS
 CTDB_SHUTDOWN_TIMEOUT
+CTDB_MONITOR_SWAP_USAGE
 EOF
 }
 
@@ -262,7 +263,6 @@ CTDB_MAX_CORRUPT_DB_BACKUPS
 # 05.system
 CTDB_MONITOR_FILESYSTEM_USAGE
 CTDB_MONITOR_MEMORY_USAGE
-CTDB_MONITOR_SWAP_USAGE
 # debug_hung_scripts.sh
 CTDB_DEBUG_HUNG_SCRIPT_STACKPAT
 EOF
diff --git a/ctdb/server/ctdb_daemon.c b/ctdb/server/ctdb_daemon.c
index a8691388d4a..c5733bb2592 100644
--- a/ctdb/server/ctdb_daemon.c
+++ b/ctdb/server/ctdb_daemon.c
@@ -72,7 +72,126 @@ static void print_exit_message(void)
 	}
 }
 
+#ifdef HAVE_GETRUSAGE
 
+struct cpu_check_threshold_data {
+	unsigned short percent;
+	struct timeval timeofday;
+	struct timeval ru_time;
+};
+
+static void ctdb_cpu_check_threshold(struct tevent_context *ev,
+				     struct tevent_timer *te,
+				     struct timeval tv,
+				     void *private_data)
+{
+	struct ctdb_context *ctdb = talloc_get_type_abort(
+		private_data, struct ctdb_context);
+	uint32_t interval = 60;
+
+	static unsigned short threshold = 0;
+	static struct cpu_check_threshold_data prev = {
+		.percent = 0,
+		.timeofday = { .tv_sec = 0 },
+		.ru_time = { .tv_sec = 0 },
+	};
+
+	struct rusage usage;
+	struct cpu_check_threshold_data curr = {
+		.percent = 0,
+	};
+	int64_t ru_time_diff, timeofday_diff;
+	bool first;
+	int ret;
+
+	/*
+	 * Cache the threshold so that we don't waste time checking
+	 * the environment variable every time
+	 */
+	if (threshold == 0) {
+		const char *t;
+
+		threshold = 90;
+
+		t = getenv("CTDB_TEST_CPU_USAGE_THRESHOLD");
+		if (t != NULL) {
+			int th;
+
+			th = atoi(t);
+			if (th <= 0 || th > 100) {
+				DBG_WARNING("Failed to parse env var: %s\n", t);
+			} else {
+				threshold = th;
+			}
+		}
+	}
+
+	ret = getrusage(RUSAGE_SELF, &usage);
+	if (ret != 0) {
+		DBG_WARNING("rusage() failed: %d\n", ret);
+		goto next;
+	}
+
+	/* Sum the system and user CPU usage */
+	curr.ru_time = timeval_sum(&usage.ru_utime, &usage.ru_stime);
+
+	curr.timeofday = tv;
+
+	first = timeval_is_zero(&prev.timeofday);
+	if (first) {
+		/* No previous values recorded so no calculation to do */
+		goto done;
+	}
+
+	timeofday_diff = usec_time_diff(&curr.timeofday, &prev.timeofday);
+	if (timeofday_diff <= 0) {
+		/*
+		 * Time went backwards or didn't progress so no (sane)
+		 * calculation can be done
+		 */
+		goto done;
+	}
+
+	ru_time_diff = usec_time_diff(&curr.ru_time, &prev.ru_time);
+
+	curr.percent = ru_time_diff * 100 / timeofday_diff;
+
+	if (curr.percent >= threshold) {
+		/* Log only if the utilisation changes */
+		if (curr.percent != prev.percent) {
+			D_WARNING("WARNING: CPU utilisation %hu%% >= "
+				  "threshold (%hu%%)\n",
+				  curr.percent,
+				  threshold);
+		}
+	} else {
+		/* Log if the utilisation falls below the threshold */
+		if (prev.percent >= threshold) {
+			D_WARNING("WARNING: CPU utilisation %hu%% < "
+				  "threshold (%hu%%)\n",
+				  curr.percent,
+				  threshold);
+		}
+	}
+
+done:
+	prev = curr;
+
+next:
+	tevent_add_timer(ctdb->ev, ctdb,
+			 timeval_current_ofs(interval, 0),
+			 ctdb_cpu_check_threshold,
+			 ctdb);
+}
+
+static void ctdb_start_cpu_check_threshold(struct ctdb_context *ctdb)
+{
+	tevent_add_timer(ctdb->ev, ctdb,
+			 timeval_current(),
+			 ctdb_cpu_check_threshold,
+			 ctdb);
+}
+#endif /* HAVE_GETRUSAGE */
 
 static void ctdb_time_tick(struct tevent_context *ev, struct tevent_timer *te,
 				  struct timeval t, void *private_data)
@@ -111,6 +230,10 @@ static void ctdb_start_periodic_events(struct ctdb_context *ctdb)
 
 	/* start listening to timer ticks */
 	ctdb_start_time_tickd(ctdb);
+
+#ifdef HAVE_GETRUSAGE
+	ctdb_start_cpu_check_threshold(ctdb);
+#endif /* HAVE_GETRUSAGE */
 }
 
 static void ignore_signal(int signum)
diff --git a/ctdb/tests/complex/11_ctdb_delip_removes_ip.sh b/ctdb/tests/complex/11_ctdb_delip_removes_ip.sh
index b5c8866d67a..543472c0f22 100755
--- a/ctdb/tests/complex/11_ctdb_delip_removes_ip.sh
+++ b/ctdb/tests/complex/11_ctdb_delip_removes_ip.sh
@@ -22,8 +22,8 @@ cluster_is_healthy
 select_test_node_and_ips
 get_test_ip_mask_and_iface
 
-echo "Checking that node ${test_node} hosts ${test_ip} on interface ${iface}..."
-try_command_on_node $test_node "ip addr show dev $iface | grep -E 'inet6?[[:space:]]*${test_ip}/'"
+echo "Checking that node ${test_node} hosts ${test_ip}..."
+try_command_on_node $test_node "ip addr show to ${test_ip} | grep -q ."
 
 echo "Attempting to remove ${test_ip} from node ${test_node}."
 try_command_on_node $test_node $CTDB delip $test_ip
@@ -33,10 +33,10 @@ wait_until_ips_are_on_node '!' $test_node $test_ip
 timeout=60
 increment=5
 count=0
-echo "Waiting for ${test_ip} to disappear from ${iface}..."
+echo "Waiting for ${test_ip} to disappear from node ${test_node}..."
 while : ; do
-    try_command_on_node -v $test_node "ip addr show dev $iface"
-    if echo "$out" | grep -E 'inet6?[[:space:]]*${test_ip}/'; then
+    try_command_on_node -v $test_node "ip addr show to ${test_node}"
+    if -n "$out" ; then
 	echo "Still there..."
 	if [ $(($count * $increment)) -ge $timeout ] ; then
 	    echo "BAD: Timed out waiting..."
diff --git a/ctdb/tests/complex/18_ctdb_reloadips.sh b/ctdb/tests/complex/18_ctdb_reloadips.sh
index 2beff771874..4ba1b26a8e8 100755
--- a/ctdb/tests/complex/18_ctdb_reloadips.sh
+++ b/ctdb/tests/complex/18_ctdb_reloadips.sh
@@ -48,12 +48,12 @@ select_test_node_and_ips
 
 echo "Getting public IP information from CTDB..."
 try_command_on_node any "$CTDB ip -X -v all"
-ctdb_ip_info=$(echo "$out" | awk -F'|' 'NR > 1 { print $2, $3, $5 }')
+ctdb_ip_info=$(awk -F'|' 'NR > 1 { print $2, $3, $5 }' "$outfile")
 
 echo "Getting IP information from interfaces..."
 try_command_on_node all "ip addr show"
-ip_addr_info=$(echo "$out" | \
-    awk '$1 == "inet" { ip = $2; sub(/\/.*/, "", ip); print ip }')
+ip_addr_info=$(awk '$1 == "inet" { ip = $2; sub(/\/.*/, "", ip); print ip }' \
+		   "$outfile")
 
 prefix=""
 for b in $(seq 0 255) ; do
@@ -168,7 +168,7 @@ check_ips ()
 
     try_command_on_node $test_node "ip addr show dev ${iface}"
     local ip_addrs_file=$(mktemp)
-    echo "$out" | \
+    cat "$outfile" | \
 	sed -n -e "s@.*inet * \(${prefix//./\.}\.[0-9]*\)/.*@\1@p" | \
 	sort >"$ip_addrs_file"
 
diff --git a/ctdb/tests/complex/32_cifs_tickle.sh b/ctdb/tests/complex/32_cifs_tickle.sh
index 4f2cdadbdfc..bfe3df4e82f 100755
--- a/ctdb/tests/complex/32_cifs_tickle.sh
+++ b/ctdb/tests/complex/32_cifs_tickle.sh
@@ -61,13 +61,6 @@ echo "Source socket is $src_socket"
 # we sometimes beat the registration.
 echo "Checking if CIFS connection is tracked by CTDB on test node..."
 wait_until 10 check_tickles $test_node $test_ip $test_port $src_socket
-echo "$out"
-
-if [ "${out/SRC: ${src_socket} /}" != "$out" ] ; then
-    echo "GOOD: CIFS connection tracked OK by CTDB."
-else
-    die "BAD: Socket not tracked by CTDB."
-fi
 
 # This is almost immediate. However, it is sent between nodes
 # asynchonously, so it is worth checking...
diff --git a/ctdb/tests/complex/36_smb_reset_server.sh b/ctdb/tests/complex/36_smb_reset_server.sh
index 0de77722fc3..870b80661aa 100755
--- a/ctdb/tests/complex/36_smb_reset_server.sh
+++ b/ctdb/tests/complex/36_smb_reset_server.sh
@@ -59,16 +59,8 @@ echo "Source socket is $src_socket"
 
 # This should happen as soon as connection is up... but unless we wait
 # we sometimes beat the registration.
-echo "Checking if CIFS connection is tracked by CTDB on test node..."
+echo "Waiting until SMB connection is tracked by CTDB on test node..."
 wait_until 10 check_tickles $test_node $test_ip $test_port $src_socket
-echo "$out"
-
-if [ "${out/SRC: ${src_socket} /}" != "$out" ] ; then
-    echo "GOOD: CIFS connection tracked OK by CTDB."
-else
-    echo "BAD: Socket not tracked by CTDB."
-    exit 1
-fi
 
 # It would be nice if ss consistently used local/peer instead of src/dst
 ss_filter="src ${test_ip}:${test_port} dst ${src_socket}"
@@ -80,7 +72,7 @@ if [ -z "$out" ] ; then
     exit 1
 fi
 echo "GOOD: ss lists the socket:"
-echo "$out"
+cat "$outfile"
 
 echo "Disabling node $test_node"
 try_command_on_node 1 $CTDB disable -n $test_node
diff --git a/ctdb/tests/complex/37_nfs_reset_server.sh b/ctdb/tests/complex/37_nfs_reset_server.sh
index 7190af0f552..32ff9295cc6 100755
--- a/ctdb/tests/complex/37_nfs_reset_server.sh
+++ b/ctdb/tests/complex/37_nfs_reset_server.sh
@@ -60,7 +60,7 @@ echo "Source socket is $src_socket"
 echo "Wait until NFS connection is tracked by CTDB on test node ..."
 wait_until $((monitor_interval * 2)) \
 	   check_tickles $test_node $test_ip $test_port $src_socket
-echo "$out"
+cat "$outfile"
 
 # It would be nice if ss consistently used local/peer instead of src/dst
 ss_filter="src ${test_ip}:${test_port} dst ${src_socket}"
@@ -72,7 +72,7 @@ if [ -z "$out" ] ; then
     exit 1
 fi
 echo "GOOD: ss lists the socket:"
-echo "$out"
+cat "$outfile"
 
 echo "Disabling node $test_node"
 try_command_on_node 1 $CTDB disable -n $test_node
diff --git a/ctdb/tests/complex/60_rogueip_releaseip.sh b/ctdb/tests/complex/60_rogueip_releaseip.sh
index 2fddc06f867..88e4e554c34 100755
--- a/ctdb/tests/complex/60_rogueip_releaseip.sh
+++ b/ctdb/tests/complex/60_rogueip_releaseip.sh
@@ -31,7 +31,7 @@ for i in $all_pnns ; do
 	continue
     fi
     try_command_on_node $i "$CTDB ip"
-    n=$(awk -v ip="$test_ip" '$1 == ip { print }' <<<"$out")
+    n=$(awk -v ip="$test_ip" '$1 == ip { print }' "$outfile")
     if [ -n "$n" ] ; then
 	other_node="$i"
 	break
diff --git a/ctdb/tests/complex/scripts/local.bash b/ctdb/tests/complex/scripts/local.bash
index 7787de8f111..787f597edcc 100644
--- a/ctdb/tests/complex/scripts/local.bash
+++ b/ctdb/tests/complex/scripts/local.bash
@@ -67,7 +67,7 @@ check_tickles ()
     local src_socket="$4"
     try_command_on_node $node ctdb gettickles $test_ip $test_port
     # SRC: 10.0.2.45:49091 DST: 10.0.2.143:445
-    [ "${out/SRC: ${src_socket} /}" != "$out" ]
+    grep -Fq "SRC: ${src_socket} " "$outfile"
 }
 
 check_tickles_all ()
@@ -79,8 +79,7 @@ check_tickles_all ()
 
     try_command_on_node all ctdb gettickles $test_ip $test_port
     # SRC: 10.0.2.45:49091 DST: 10.0.2.143:445
-    local t="${src_socket//./\\.}"
-    local count=$(grep -E -c "SRC: ${t} " <<<"$out" || true)
+    local count=$(grep -Fc "SRC: ${src_socket} " "$outfile" || true)
     [ $count -eq $numnodes ]
 }
 
diff --git a/ctdb/tests/eventscripts/05.system.monitor.011.sh b/ctdb/tests/eventscripts/05.system.monitor.011.sh
index a7d2e99c2b7..6cd1dabbb37 100755
--- a/ctdb/tests/eventscripts/05.system.monitor.011.sh
+++ b/ctdb/tests/eventscripts/05.system.monitor.011.sh
@@ -2,13 +2,12 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "Memory check, bad situation, default checks enabled"
+define_test "Memory check (default), warning situation"
 
 setup
 set_mem_usage 100 100
 ok <<EOF
 WARNING: System memory utilization 100% >= threshold 80%
-WARNING: System swap utilization 100% >= threshold 25%
 EOF
 
 simple_test
diff --git a/ctdb/tests/eventscripts/05.system.monitor.012.sh b/ctdb/tests/eventscripts/05.system.monitor.012.sh
index bc517081e42..9e840564f49 100755
--- a/ctdb/tests/eventscripts/05.system.monitor.012.sh
+++ b/ctdb/tests/eventscripts/05.system.monitor.012.sh
@@ -2,13 +2,12 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "Memory check, good situation, all memory checks enabled"
+define_test "Memory check (custom, both), good situation"
 
 setup
 
 setup_script_options <<EOF
 CTDB_MONITOR_MEMORY_USAGE="80:90"
-CTDB_MONITOR_SWAP_USAGE="1:50"
 EOF
 
 ok_null
diff --git a/ctdb/tests/eventscripts/05.system.monitor.013.sh b/ctdb/tests/eventscripts/05.system.monitor.013.sh
deleted file mode 100755
index f4ea7ded6d0..00000000000
--- a/ctdb/tests/eventscripts/05.system.monitor.013.sh
+++ /dev/null
@@ -1,21 +0,0 @@
-#!/bin/sh
-
-. "${TEST_SCRIPTS_DIR}/unit.sh"
-
-define_test "Memory check, bad situation, custom swap critical"
-
-setup
-
-setup_script_options <<EOF
-CTDB_MONITOR_SWAP_USAGE=":50"
-EOF
-
-set_mem_usage 100 90
-required_result 1 <<EOF
-WARNING: System memory utilization 100% >= threshold 80%
-ERROR: System swap utilization 90% >= threshold 50%
-$FAKE_PROC_MEMINFO
-$(ps foobar)
-EOF
-
-simple_test
diff --git a/ctdb/tests/eventscripts/05.system.monitor.014.sh b/ctdb/tests/eventscripts/05.system.monitor.014.sh
index 1b6d2155272..9e2b21c9822 100755
--- a/ctdb/tests/eventscripts/05.system.monitor.014.sh
+++ b/ctdb/tests/eventscripts/05.system.monitor.014.sh
@@ -2,7 +2,7 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "Memory check, bad memory situation, custom memory warning"
+define_test "Memory check (custom, warning only), warning situation"
 
 setup
 
@@ -10,7 +10,7 @@ setup_script_options <<EOF
 CTDB_MONITOR_MEMORY_USAGE="85:"
 EOF
 
-set_mem_usage 90 10
+set_mem_usage 90 90
 ok <<EOF
 WARNING: System memory utilization 90% >= threshold 85%
 EOF
diff --git a/ctdb/tests/eventscripts/05.system.monitor.015.sh b/ctdb/tests/eventscripts/05.system.monitor.015.sh
index 3f1fe9bfc46..0091c429ac1 100755
--- a/ctdb/tests/eventscripts/05.system.monitor.015.sh
+++ b/ctdb/tests/eventscripts/05.system.monitor.015.sh
@@ -2,7 +2,7 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "Memory check, bad situation, custom memory critical"
+define_test "Memory check (custom, error only), error situation"
 
 setup
 


-- 
Samba Shared Repository