The branch, master has been updated
       via  4164d7b ctdb-scripts: Add default filesystem usage warnings
       via  0f28ccf ctdb-scripts: Add default system memory usage warnings
       via  2c601f1 ctdb-scripts: Enable system monitoring eventscript by default
       via  b18e4ae ctdb-scripts: Throttle system resource monitoring warnings
       via  e6b5163 ctdb-scripts: Don't shutdown CTDB when memory monitoring fails
       via  b6a0e4b ctdb-scripts: New consistent system memory and swap monitoring
       via  02fa6c3 ctdb-scripts: Factor out new function check_thresholds()
       via  b7b6e25 ctdb-scripts: Memory monitoring uses thresholds expressed as percentages
       via  bd2845d ctdb-scripts: Use MemAvailable if it is in /proc/meminfo
       via  99b8ef5 ctdb-scripts: Only use /proc/meminfo for memory checks, not "free"
       via  ab58c7a ctdb-scripts: Move system memory checking to 05.system
       via  b27ff25 ctdb-tests: Remove unwanted trailing whitespace
       via  23acbd2 ctdb-tests: Add tests for filesystem usage monitoring
       via  fa10506 ctdb-scripts: New configuration variable CTDB_MONITOR_FILESYSTEM_USAGE
       via  8f713c8 ctdb-scripts: Don't fail monitoring if sanity checks fail
       via  6b4a46e ctdb-scripts: Move filesystem monitoring into a function, clean it up
       via  47f7d1b ctdb-scripts: Rename 40.fs_use to 05.system
      from  e139f19 s3: add suport for SMB3_10 and SMB3_11 protocols in smbstatus

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 4164d7bf3153a2fd9081b4d073bfa88fec1507ad
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Aug 18 15:22:23 2015 +1000

    ctdb-scripts: Add default filesystem usage warnings
    
    Always check filesystem usage for the database directories.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>
    
    Autobuild-User(master): Amitay Isaacs <ami...@samba.org>
    Autobuild-Date(master): Sat Aug 29 20:08:48 CEST 2015 on sn-devel-104

commit 0f28ccf87af4e90867eaab213a640f6d0cdaa12d
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Aug 14 17:08:45 2015 +1000

    ctdb-scripts: Add default system memory usage warnings
    
    CTDB should warn by default if too much system memory or swap is used.
    
    The tests have also been tweaked.  In particular, the filesystem-only
    tests need to initialise the memory information to avoid errors where
    meminfo isn't set.
    
    Document the defaults, warning against disabling them.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 2c601f189521ae65ec5ab867c6d8c88cb5d1ae8c
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Aug 6 15:59:06 2015 +1000

    ctdb-scripts: Enable system monitoring eventscript by default
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b18e4ae0c9536a549722aeef8bc6c095b12db962
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Aug 5 20:42:16 2015 +1000

    ctdb-scripts: Throttle system resource monitoring warnings
    
    They are only printed when the percentage usage changes.  This should
    stop the logs from being filled with warnings.
    
    Add a test for the throttling.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>
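
The throttling described in this commit can be sketched roughly as follows. This is only an illustration of the idea (remember the last reported percentage and stay quiet until it changes), not the code from 05.system; the function name warn_if_changed and its cache-file argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: print a warning only when the usage value differs
# from the value recorded by the previous call.
warn_if_changed ()
{
    _thing="$1"     # what is being monitored, e.g. "System memory"
    _usage="$2"     # current usage as an integer percentage
    _cache="$3"     # file holding the previously reported usage

    _prev=""
    if [ -r "$_cache" ] ; then
        read -r _prev <"$_cache"
    fi

    if [ "$_usage" != "$_prev" ] ; then
        echo "WARNING: ${_thing} utilization ${_usage}%"
        echo "$_usage" >"$_cache"
    fi
}
```

Calling the function twice with the same percentage prints one warning; a third call with a different percentage prints another.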

commit e6b5163bc1c3551a808d3741b4cbac80e15d10d9
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Aug 3 19:55:27 2015 +1000

    ctdb-scripts: Don't shutdown CTDB when memory monitoring fails
    
    Marking the node unhealthy should cause Samba processes to close,
    possibly freeing a stack of memory.  If not, then it is somebody
    else's problem.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b6a0e4b85699241ba90f25f4c605cbb7a6fc2146
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Aug 3 17:22:08 2015 +1000

    ctdb-scripts: New consistent system memory and swap monitoring
    
    New variables CTDB_MONITOR_MEMORY_USAGE and CTDB_MONITOR_SWAP_USAGE.
    Both take a pair of <warn_threshold>:<unhealthy_threshold> where each
    threshold is specified as a percentage.
    
    This adds a callout to check_thresholds() that is run when the
    unhealthy threshold is reached.
    
    Add some combination tests.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>
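
As a rough illustration of the <warn_threshold>:<unhealthy_threshold> format, the sketch below splits such a value with POSIX parameter expansion, in the same spirit as the case statement in check_thresholds(). The helper name split_thresholds and the example values are assumptions made for illustration, not recommended settings.

```shell
#!/bin/sh
# Hypothetical helper: split "<warn>:<unhealthy>" into two variables.
# A value without ":" is treated as a warning threshold only.
split_thresholds ()
{
    case "$1" in
    *:*)
        warn_threshold="${1%:*}"
        unhealthy_threshold="${1#*:}"
        ;;
    *)
        warn_threshold="$1"
        unhealthy_threshold=""
        ;;
    esac
}

# Example settings (illustrative values only)
CTDB_MONITOR_MEMORY_USAGE="80:90"
CTDB_MONITOR_SWAP_USAGE="25"

split_thresholds "$CTDB_MONITOR_MEMORY_USAGE"
echo "memory: warn=${warn_threshold} unhealthy=${unhealthy_threshold}"
split_thresholds "$CTDB_MONITOR_SWAP_USAGE"
echo "swap: warn=${warn_threshold} unhealthy=${unhealthy_threshold}"
```

Leaving the part before the colon empty (for example ":90") yields an empty warning threshold, which corresponds to omitting that check.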

commit 02fa6c3d106e8fbf0e685afafa5e6a9bc0c3d22d
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Aug 3 16:20:40 2015 +1000

    ctdb-scripts: Factor out new function check_thresholds()
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b7b6e25b3e26210ed196be7fc5848e3320b5c35b
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Aug 3 15:59:50 2015 +1000

    ctdb-scripts: Memory monitoring uses thresholds expressed as percentages
    
    CTDB_MONITOR_FREE_MEMORY and CTDB_MONITOR_FREE_MEMORY_WARN are now
    percentages that specify thresholds of acceptable memory usage.
    
    Memory/swap usage in tests also specified as percentages.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit bd2845d7ebe9e2970d4d5546e51c79c9b40ce9cb
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jul 24 19:57:42 2015 +1000

    ctdb-scripts: Use MemAvailable if it is in /proc/meminfo
    
    Otherwise calculate, as before.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>
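
A minimal sketch of the calculation, assuming meminfo-style input on stdin: prefer MemAvailable when the kernel provides it, otherwise fall back to MemFree + Buffers + Cached. The function name mem_usage_percent and the sample numbers are hypothetical, and the guard against a zero MemTotal is an addition of this sketch.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/meminfo-style text on stdin and print
# memory usage as an integer percentage.
mem_usage_percent ()
{
    awk '
$1 == "MemAvailable:" { memavail = $2 ; have_avail = 1 }
$1 == "MemFree:"      { memfree += $2 }
$1 == "Cached:"       { memfree += $2 }
$1 == "Buffers:"      { memfree += $2 }
$1 == "MemTotal:"     { memtotal = $2 }
END {
    # Use MemAvailable if present, otherwise the calculated value
    if (have_avail) { memfree = memavail }
    # Avoid dividing by zero on malformed input
    if (memtotal > 0) {
        print int((memtotal - memfree) / memtotal * 100)
    }
}'
}

# Sample input with made-up numbers; MemAvailable takes precedence
mem_usage_percent <<EOF
MemTotal:       1000 kB
MemFree:         100 kB
MemAvailable:    500 kB
Buffers:          50 kB
Cached:          100 kB
EOF
```

On a live Linux system the same function could be fed from /proc/meminfo instead of the here-document.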

commit 99b8ef512162570504689b53adb14a52233f49b7
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Jul 20 20:50:56 2015 +1000

    ctdb-scripts: Only use /proc/meminfo for memory checks, not "free"
    
    No need to use 2 different sources of information for similar checks.
    Also, output of free has been changed, whereas /proc/meminfo is a
    kernel API, which will not change.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit ab58c7abd9c49325c3cee1e7178d04a3034e57d8
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Jul 20 16:08:13 2015 +1000

    ctdb-scripts: Move system memory checking to 05.system
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b27ff251aff6d7c5c59dbe9b1748b30587402aa3
Author: Martin Schwenke <mar...@meltin.net>
Date:   Thu Aug 20 11:47:19 2015 +1000

    ctdb-tests: Remove unwanted trailing whitespace
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 23acbd2f4b0079d1fab01a7dad135e3451efd6d7
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jul 17 21:32:01 2015 +1000

    ctdb-tests: Add tests for filesystem usage monitoring
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit fa1050690bd28cac8bc99047a900caf2e5fca22f
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Aug 3 14:56:40 2015 +1000

    ctdb-scripts: New configuration variable CTDB_MONITOR_FILESYSTEM_USAGE
    
    This allows both errors (i.e. unhealthy) and warnings for different
    thresholds.  It replaces CTDB_CHECK_FS_USE.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>
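
To make the new format concrete, here is a small sketch of how a space-separated list of <fs_mount>:<fs_warn_threshold>[:fs_unhealthy_threshold] entries can be taken apart. The mount points and thresholds are made up, and parse_fs_entry is a hypothetical helper name; the parameter expansions mirror those used in monitor_filesystem_usage().

```shell
#!/bin/sh
# Hypothetical helper: split one "<mount>:<warn>[:<unhealthy>]" entry.
parse_fs_entry ()
{
    fs_mount="${1%%:*}"        # text before the first ":"
    fs_thresholds="${1#*:}"    # text after the first ":"
}

# Illustrative value only: warn at 90% on /var; warn at 80% and mark the
# node unhealthy at 95% on /srv/data.
CTDB_MONITOR_FILESYSTEM_USAGE="/var:90 /srv/data:80:95"

for _fs in $CTDB_MONITOR_FILESYSTEM_USAGE ; do
    parse_fs_entry "$_fs"
    echo "${fs_mount} -> ${fs_thresholds}"
done
```

The thresholds part can then be split into warning and unhealthy values in a second step.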

commit 8f713c87c1359ef8780018718f6fa47bb0fa82a7
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jul 24 19:56:06 2015 +1000

    ctdb-scripts: Don't fail monitoring if sanity checks fail
    
    Just log some warnings.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 6b4a46e5742732d7cbdf911b74ab0bb1fc8e3b97
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jul 17 20:04:44 2015 +1000

    ctdb-scripts: Move filesystem monitoring into a function, clean it up
    
    Drop obvious comments.  Use die() for less lines of code.  Use a case
    statement to avoid forking unnecessary processes for each filesystem
    being checked.  Drop parentheses around percentages in messages.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>
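
The point about avoiding forks can be illustrated with a threshold check written as a case statement, which the shell evaluates itself without starting a subshell or an external grep process. This is a sketch along the lines of validate_percentage() in 05.system; the name is_percentage is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if $1 is an integer from 0 to 100.
# Pattern matching is done by the shell, so no process is forked.
is_percentage ()
{
    case "$1" in
    [0-9]|[0-9][0-9]|100) return 0 ;;
    *) return 1 ;;
    esac
}

for _v in 0 90 100 101 abc ; do
    if is_percentage "$_v" ; then
        echo "${_v}: valid"
    else
        echo "${_v}: invalid"
    fi
done
```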

commit 47f7d1b1c8432ffdfb71176cf64cdd31e188e59c
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jul 17 11:59:56 2015 +1000

    ctdb-scripts: Rename 40.fs_use to 05.system
    
    Will put all the system monitoring in here, simplifying 00.ctdb.
    
    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/config/events.d/00.ctdb                     |  43 ------
 ctdb/config/events.d/05.system                   | 176 +++++++++++++++++++++++
 ctdb/config/events.d/40.fs_use                   |  55 -------
 ctdb/doc/ctdbd.conf.5.xml                        |  92 ++++++------
 ctdb/packaging/RPM/ctdb.spec.in                  |   2 +-
 ctdb/tests/eventscripts/00.ctdb.monitor.001.sh   |  15 --
 ctdb/tests/eventscripts/00.ctdb.monitor.002.sh   |  15 --
 ctdb/tests/eventscripts/00.ctdb.monitor.003.sh   |  19 ---
 ctdb/tests/eventscripts/00.ctdb.monitor.004.sh   |  17 ---
 ctdb/tests/eventscripts/00.ctdb.monitor.005.sh   |  21 ---
 ctdb/tests/eventscripts/05.system.monitor.001.sh |  14 ++
 ctdb/tests/eventscripts/05.system.monitor.002.sh |  12 ++
 ctdb/tests/eventscripts/05.system.monitor.003.sh |  14 ++
 ctdb/tests/eventscripts/05.system.monitor.004.sh |  12 ++
 ctdb/tests/eventscripts/05.system.monitor.005.sh |  14 ++
 ctdb/tests/eventscripts/05.system.monitor.006.sh |  14 ++
 ctdb/tests/eventscripts/05.system.monitor.007.sh |  12 ++
 ctdb/tests/eventscripts/05.system.monitor.011.sh |  16 +++
 ctdb/tests/eventscripts/05.system.monitor.012.sh |  14 ++
 ctdb/tests/eventscripts/05.system.monitor.013.sh |  19 +++
 ctdb/tests/eventscripts/05.system.monitor.014.sh |  16 +++
 ctdb/tests/eventscripts/05.system.monitor.015.sh |  18 +++
 ctdb/tests/eventscripts/05.system.monitor.016.sh |  16 +++
 ctdb/tests/eventscripts/05.system.monitor.017.sh |  40 ++++++
 ctdb/tests/eventscripts/05.system.monitor.018.sh | 123 ++++++++++++++++
 ctdb/tests/eventscripts/scripts/local.sh         |  60 +++++---
 ctdb/tests/eventscripts/stubs/df                 |  38 +++++
 ctdb/tests/eventscripts/stubs/free               |   9 --
 ctdb/tests/eventscripts/stubs/ps                 |   2 +-
 29 files changed, 653 insertions(+), 265 deletions(-)
 create mode 100755 ctdb/config/events.d/05.system
 delete mode 100644 ctdb/config/events.d/40.fs_use
 delete mode 100755 ctdb/tests/eventscripts/00.ctdb.monitor.001.sh
 delete mode 100755 ctdb/tests/eventscripts/00.ctdb.monitor.002.sh
 delete mode 100755 ctdb/tests/eventscripts/00.ctdb.monitor.003.sh
 delete mode 100755 ctdb/tests/eventscripts/00.ctdb.monitor.004.sh
 delete mode 100755 ctdb/tests/eventscripts/00.ctdb.monitor.005.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.001.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.002.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.003.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.004.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.005.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.006.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.007.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.011.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.012.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.013.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.014.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.015.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.016.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.017.sh
 create mode 100755 ctdb/tests/eventscripts/05.system.monitor.018.sh
 create mode 100755 ctdb/tests/eventscripts/stubs/df
 delete mode 100755 ctdb/tests/eventscripts/stubs/free


Changeset truncated at 500 lines:

diff --git a/ctdb/config/events.d/00.ctdb b/ctdb/config/events.d/00.ctdb
index 0e25e50..da7186f 100755
--- a/ctdb/config/events.d/00.ctdb
+++ b/ctdb/config/events.d/00.ctdb
@@ -116,46 +116,6 @@ set_ctdb_variables ()
     done
 }
 
-monitor_system_memory ()
-{
-    # If monitoring free memory then calculate how much there is
-    if [ -n "$CTDB_MONITOR_FREE_MEMORY_WARN" -o \
-       -n "$CTDB_MONITOR_FREE_MEMORY" ] ; then
-       free_mem=$(free -m | awk '$2 == "buffers/cache:" { print $4 }')
-    fi
-
-    # Shutdown CTDB when memory is below the configured limit
-    if [ -n "$CTDB_MONITOR_FREE_MEMORY" ] ; then
-       if [ $free_mem -le $CTDB_MONITOR_FREE_MEMORY ] ; then
-           echo "CRITICAL: OOM - ${free_mem}MB free <= ${CTDB_MONITOR_FREE_MEMORY}MB (CTDB threshold)"
-           echo "CRITICAL: Shutting down CTDB!!!"
-           get_proc "meminfo"
-           ps auxfww
-           set_proc "sysrq-trigger" "m"
-           ctdb disable
-           sleep 3
-           ctdb shutdown
-       fi
-    fi
-
-    # Warn when low on memory
-    if [ -n "$CTDB_MONITOR_FREE_MEMORY_WARN" ] ; then
-       if [ $free_mem -le $CTDB_MONITOR_FREE_MEMORY_WARN ] ; then
-           echo "WARNING: free memory is low - ${free_mem}MB free <=  ${CTDB_MONITOR_FREE_MEMORY_WARN}MB (CTDB threshold)"
-       fi
-    fi
-
-    # We should never enter swap, so SwapTotal == SwapFree.
-    if [ "$CTDB_CHECK_SWAP_IS_NOT_USED" = "yes" ] ; then
-       set -- $(get_proc "meminfo" | awk '$1 ~ /Swap(Total|Free):/ { print $2 }')
-       if [ "$1" != "$2" ] ; then
-           echo We are swapping:
-           get_proc "meminfo"
-           ps auxfww
-       fi
-    fi
-}
-
 ############################################################
 
 ctdb_check_args "$@"
@@ -187,9 +147,6 @@ case "$1" in
     startup)
        ctdb attach ctdb.tdb persistent
        ;;
-    monitor)
-       monitor_system_memory
-       ;;
 
     *)
        ctdb_standard_event_handler "$@"
diff --git a/ctdb/config/events.d/05.system b/ctdb/config/events.d/05.system
new file mode 100755
index 0000000..69fcec2
--- /dev/null
+++ b/ctdb/config/events.d/05.system
@@ -0,0 +1,176 @@
+#!/bin/sh
+# ctdb event script for checking local file system utilization
+
+[ -n "$CTDB_BASE" ] || \
+    export CTDB_BASE=$(cd -P $(dirname "$0") ; dirname "$PWD")
+
+. $CTDB_BASE/functions
+loadconfig
+
+ctdb_setup_service_state_dir "system-monitoring"
+
+validate_percentage ()
+{
+    case "$1" in
+       "") return 1 ;;  # A failure that doesn't need a warning
+       [0-9]|[0-9][0-9]|100) return 0 ;;
+       *) echo "WARNING: ${1} is an invalid percentage${2:+ in \"}${2}${2:+\"} check"
+          return 1
+    esac
+}
+
+check_thresholds ()
+{
+    _thing="$1"
+    _thresholds="$2"
+    _usage="$3"
+    _unhealthy_callout="$4"
+
+    case "$_thresholds" in
+       *:*)
+           _warn_threshold="${_thresholds%:*}"
+           _unhealthy_threshold="${_thresholds#*:}"
+           ;;
+       *)
+           _warn_threshold="$_thresholds"
+           _unhealthy_threshold=""
+    esac
+
+    _t=$(echo "$_thing" | sed -e 's@/@SLASH_@g' -e 's@ @_@g')
+    _cache="${service_state_dir}/cache_${_t}"
+    if validate_percentage "$_unhealthy_threshold" "$_thing" ; then
+        if [ "$_usage" -ge "$_unhealthy_threshold" ] ; then
+           echo "ERROR: ${_thing} utilization ${_usage}% >= threshold ${_unhealthy_threshold}%"
+           eval "$_unhealthy_callout"
+           echo "$_usage" >"$_cache"
+           exit 1
+        fi
+    fi
+
+    if validate_percentage "$_warn_threshold" "$_what" ; then
+        if [ "$_usage" -ge "$_warn_threshold" ] ; then
+           if [ -r "$_cache" ] ; then
+               read _prev <"$_cache"
+           else
+               _prev=""
+           fi
+           if [ "$_usage" != "$_prev" ] ; then
+               echo "WARNING: ${_thing} utilization ${_usage}% >= threshold ${_warn_threshold}%"
+               echo "$_usage" >"$_cache"
+           fi
+       else
+           if [ -r "$_cache" ] ; then
+               echo "NOTICE: ${_thing} utilization ${_usage}% < threshold ${_warn_threshold}%"
+           fi
+           rm -f "$_cache"
+        fi
+    fi
+}
+
+set_monitor_filsystem_usage_defaults ()
+{
+    _fs_defaults_cache="${service_state_dir}/cache_monitor_filsystem_usage_defaults"
+
+    if [ ! -r "$_fs_defaults_cache" ] ; then
+       # Determine filesystem for each database directory, generate
+       # an entry to warn at 90%, de-duplicate entries, put all items
+       # on 1 line (so the read below gets everything)
+       for _t in "${CTDB_DBDIR:-${CTDB_VARDIR}}" \
+                     "${CTDB_DBDIR_PERSISTENT:-${CTDB_VARDIR}/persistent}" \
+                     "${CTDB_DBDIR_STATE:-${CTDB_VARDIR}/state}" ; do
+           df -kP "$_t" | awk 'NR == 2 { printf "%s:90\n", $6 }'
+       done | sort -u | xargs >"$_fs_defaults_cache"
+    fi
+
+    read CTDB_MONITOR_FILESYSTEM_USAGE <"$_fs_defaults_cache"
+}
+
+monitor_filesystem_usage ()
+{
+    if [ -z "$CTDB_MONITOR_FILESYSTEM_USAGE" ] ; then
+       set_monitor_filsystem_usage_defaults
+    fi
+
+    # Check each specified filesystem, specified in format
+    # <fs_mount>:<fs_warn_threshold>[:fs_unhealthy_threshold]
+    for _fs in $CTDB_MONITOR_FILESYSTEM_USAGE ; do
+       _fs_mount="${_fs%%:*}"
+       _fs_thresholds="${_fs#*:}"
+
+        if [ ! -d "$_fs_mount" ]; then
+            echo "WARNING: Directory ${_fs_mount} does not exist"
+           continue
+        fi
+
+        # Get current utilization
+        _fs_usage=$(df -kP "$_fs_mount" | \
+                          sed -n -e 's@.*[[:space:]]\([[:digit:]]*\)%.*@\1@p')
+        if [ -z "$_fs_usage" ] ; then
+            echo "WARNING: Unable to get FS utilization for ${_fs_mount}"
+           continue
+        fi
+
+       check_thresholds "Filesystem ${_fs_mount}" \
+                        "$_fs_thresholds" \
+                        "$_fs_usage"
+    done
+}
+
+dump_memory_info ()
+{
+    get_proc "meminfo"
+    ps auxfww
+    set_proc "sysrq-trigger" "m"
+}
+
+monitor_memory_usage ()
+{
+    # Defaults
+    if [ -z "$CTDB_MONITOR_MEMORY_USAGE" ] ; then
+       CTDB_MONITOR_MEMORY_USAGE=80
+    fi
+    if [ -z "$CTDB_MONITOR_SWAP_USAGE" ] ; then
+       CTDB_MONITOR_SWAP_USAGE=25
+    fi
+
+    _meminfo=$(get_proc "meminfo")
+    set -- $(echo "$_meminfo" | awk '
+$1 == "MemAvailable:" { memavail += $2 }
+$1 == "MemFree:"      { memfree  += $2 }
+$1 == "Cached:"       { memfree  += $2 }
+$1 == "Buffers:"      { memfree  += $2 }
+$1 == "MemTotal:"     { memtotal  = $2 }
+$1 == "SwapFree:"     { swapfree  = $2 }
+$1 == "SwapTotal:"    { swaptotal = $2 }
+END {
+    if (memavail != 0) { memfree = memavail ; }
+    print int((memtotal -  memfree)  / memtotal * 100),
+          int((swaptotal - swapfree) / swaptotal * 100)
+}')
+    _mem_usage="$1"
+    _swap_usage="$2"
+
+    check_thresholds "System memory" \
+                    "$CTDB_MONITOR_MEMORY_USAGE" \
+                    "$_mem_usage" \
+                    dump_memory_info
+
+    check_thresholds "System swap" \
+                    "$CTDB_MONITOR_SWAP_USAGE" \
+                    "$_swap_usage" \
+                    dump_memory_info
+}
+
+
+case "$1" in
+    monitor)
+       monitor_filesystem_usage
+       monitor_memory_usage
+       ;;
+
+    *)
+       ctdb_standard_event_handler "$@"
+       ;;
+esac
+
+exit 0
diff --git a/ctdb/config/events.d/40.fs_use b/ctdb/config/events.d/40.fs_use
deleted file mode 100644
index 603b463..0000000
--- a/ctdb/config/events.d/40.fs_use
+++ /dev/null
@@ -1,55 +0,0 @@
-#!/bin/sh
-# ctdb event script for checking local file system utilization
-
-[ -n "$CTDB_BASE" ] || \
-    export CTDB_BASE=$(cd -P $(dirname "$0") ; dirname "$PWD")
-
-. $CTDB_BASE/functions
-loadconfig
-
-case "$1" in 
-    monitor)
-        # check each specified fs to be checked
-        # config format is <fs_mount>:<fs_threshold>
-        for fs in $CTDB_CHECK_FS_USE
-        do
-            # parse fs_mount and fs_threshold
-            fs_mount="${fs%:*}"
-            fs_threshold="${fs#*:}"
-
-            # check if given fs_mount is existing directory
-            if [ ! -d "$fs_mount" ]; then
-                echo "Directory $fs_mount does not exist"
-                exit 1
-            fi
-
-            # check if given fs_threshold is number
-            if ! (echo "$fs_threshold" | egrep -q '^[0-9]+$')  ; then
-                echo "Threshold $fs_threshold is invalid number"
-                exit 1
-            fi
-
-            # get utilization of given fs from df
-            fs_usage=$(df -kP $fs_mount | sed -n -e 's@.*[[:space:]]\([[:digit:]]*\)%.*@\1@p')
-
-            # check if fs_usage is number
-            if [ -z "$fs_usage" ] ; then
-                echo "Unable to get FS utilization for $fs_mount"
-                exit 1
-            fi
-
-            # check if fs_usage is higher than or equal to fs_threshold
-            if [ "$fs_usage" -ge "$fs_threshold" ] ; then
-                echo "ERROR: Utilization of $fs_mount ($fs_usage%) is higher than threshold ($fs_threshold%)"
-                exit 1
-            fi
-        done
-
-       ;;
-
-    *)
-       ctdb_standard_event_handler "$@"
-       ;;
-esac
-
-exit 0
diff --git a/ctdb/doc/ctdbd.conf.5.xml b/ctdb/doc/ctdbd.conf.5.xml
index da53e51..f45c724 100644
--- a/ctdb/doc/ctdbd.conf.5.xml
+++ b/ctdb/doc/ctdbd.conf.5.xml
@@ -1279,91 +1279,91 @@ CTDB_PER_IP_ROUTING_TABLE_ID_HIGH=9000
 
       <para>
        CTDB can experience seemingly random (performance and other)
-       issues if system resources become too contrained.  Options in
-       this section can be enabled to allow certain system resources to
-       be checked.
+       issues if system resources become too constrained.  Options in
+       this section can be enabled to allow certain system resources
+       to be checked.  They allow warnings to be logged and nodes to
+       be marked unhealthy when system resource usage reaches the
+       configured thresholds.
+      </para>
+
+      <para>
+       Some checks are enabled by default.  It is recommended that
+       these checks remain enabled or are augmented by extra checks.
+       There is no supported way of completely disabling the checks.
       </para>
 
       <refsect3>
        <title>Eventscripts</title>
 
        <simplelist>
-         <member><filename>00.ctdb</filename></member>
-         <member><filename>40.fs_use</filename></member>
+         <member><filename>05.system</filename></member>
        </simplelist>
 
        <para>
-         Filesystem usage monitoring is in
-         <filename>40.fs_use</filename>.  This eventscript is not
-         enabled by default.  Use <command>ctdb
-         enablescript</command> to enable it.
+         Filesystem and memory usage monitoring is in
+         <filename>05.system</filename>.
        </para>
       </refsect3>
 
       <variablelist>
 
        <varlistentry>
-         <term>CTDB_CHECK_FS_USE=<parameter>FS-LIMIT-LIST</parameter></term>
+         <term>CTDB_MONITOR_FILESYSTEM_USAGE=<parameter>FS-LIMIT-LIST</parameter></term>
          <listitem>
            <para>
              FS-LIMIT-LIST is a space-separated list of
-             <parameter>FILESYSTEM</parameter>:<parameter>LIMIT</parameter>
-             pairs indicating that a node should be flagged unhealthy
-             if the space used on FILESYSTEM reaches LIMIT%.
-           </para>
-
-           <para>
-             No default.
+             <parameter>FILESYSTEM</parameter>:<parameter>WARN_LIMIT</parameter><optional>:<parameter>UNHEALTHY_LIMIT</parameter></optional>
+             triples indicating that warnings should be logged if the
+             space used on FILESYSTEM reaches WARN_LIMIT%.  If usage
+             reaches UNHEALTHY_LIMIT then the node should be flagged
+             unhealthy.  Either WARN_LIMIT or UNHEALTHY_LIMIT may be
+             left blank, meaning that check will be omitted.
            </para>
 
            <para>
-             Note that this feature uses the
-             <filename>40.fs_use</filename> eventscript, which is not
-             enabled by default.  Use <command>ctdb
-             enablescript</command> to enable it.
+             Default is to warn for each filesystem containing a
+             database directory (<envar>CTDB_DBDIR</envar>,
+             <envar>CTDB_DBDIR_PERSISTENT</envar>,
+             <envar>CTDB_DBDIR_STATE</envar>) with a threshold of
+             90%.
            </para>
          </listitem>
        </varlistentry>
 
        <varlistentry>
-         <term>CTDB_CHECK_SWAP_IS_NOT_USED=yes|no</term>
+         <term>CTDB_MONITOR_MEMORY_USAGE=<parameter>MEM-LIMITS</parameter></term>
          <listitem>
            <para>
-             Should a warning be logged if swap space is in use.
+             MEM-LIMITS takes the form
+             <parameter>WARN_LIMIT</parameter><optional>:<parameter>UNHEALTHY_LIMIT</parameter></optional>
+             indicating that warnings should be logged if memory
+             usage reaches WARN_LIMIT%.  If usage reaches
+             UNHEALTHY_LIMIT then the node should be flagged
+             unhealthy.  Either WARN_LIMIT or UNHEALTHY_LIMIT may be
+             left blank, meaning that check will be omitted.
            </para>
            <para>
-             Default is no.
+             Default is 80, so warnings will be logged when memory
+             usage reaches 80%.
            </para>
          </listitem>
        </varlistentry>
 
        <varlistentry>
-         <term>CTDB_MONITOR_FREE_MEMORY=<parameter>NUM</parameter></term>
+         <term>CTDB_MONITOR_SWAP_USAGE=<parameter>SWAP-LIMITS</parameter></term>
          <listitem>
            <para>
-             NUM is a lower limit on available system memory, expressed
-             in megabytes.  If this is set and the amount of available
-             memory falls below this limit then some debug information
-             will be logged, the node will be disabled and then CTDB
-             will be shut down.
+             SWAP-LIMITS takes the form
+             <parameter>WARN_LIMIT</parameter><optional>:<parameter>UNHEALTHY_LIMIT</parameter></optional>
+              indicating that warnings should be logged if
+             swap usage reaches WARN_LIMIT%.  If usage reaches
+             UNHEALTHY_LIMIT then the node should be flagged
+             unhealthy.  Either WARN_LIMIT or UNHEALTHY_LIMIT may be
+             left blank, meaning that check will be omitted.
            </para>
            <para>
-             No default.
-           </para>
-         </listitem>
-       </varlistentry>
-
-       <varlistentry>
-         <term>CTDB_MONITOR_FREE_MEMORY_WARN=<parameter>NUM</parameter></term>
-         <listitem>
-           <para>
-             NUM is a lower limit on available system memory, expressed
-             in megabytes.  If this is set and the amount of available
-             memory falls below this limit then a warning will be
-             logged.
-           </para>
-           <para>
-             No default.
+             Default is 25, so warnings will be logged when swap
+             usage reaches 25%.
            </para>
          </listitem>
        </varlistentry>
diff --git a/ctdb/packaging/RPM/ctdb.spec.in b/ctdb/packaging/RPM/ctdb.spec.in
index 00f0be5..318dacf 100644
--- a/ctdb/packaging/RPM/ctdb.spec.in
+++ b/ctdb/packaging/RPM/ctdb.spec.in
@@ -167,6 +167,7 @@ rm -rf $RPM_BUILD_ROOT
 %{_sysconfdir}/ctdb/functions
 %{_sysconfdir}/ctdb/events.d/00.ctdb
 %{_sysconfdir}/ctdb/events.d/01.reclock
+%{_sysconfdir}/ctdb/events.d/05.system
 %{_sysconfdir}/ctdb/events.d/10.interface
 %{_sysconfdir}/ctdb/events.d/10.external
 %{_sysconfdir}/ctdb/events.d/13.per_ip_routing
@@ -174,7 +175,6 @@ rm -rf $RPM_BUILD_ROOT
 %{_sysconfdir}/ctdb/events.d/11.routing
 %{_sysconfdir}/ctdb/events.d/20.multipathd
 %{_sysconfdir}/ctdb/events.d/31.clamd
-%{_sysconfdir}/ctdb/events.d/40.fs_use
 %{_sysconfdir}/ctdb/events.d/40.vsftpd
 %{_sysconfdir}/ctdb/events.d/41.httpd
 %{_sysconfdir}/ctdb/events.d/49.winbind
diff --git a/ctdb/tests/eventscripts/00.ctdb.monitor.001.sh b/ctdb/tests/eventscripts/00.ctdb.monitor.001.sh
deleted file mode 100755
index 4290d13..0000000
--- a/ctdb/tests/eventscripts/00.ctdb.monitor.001.sh
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/bin/sh
-
-. "${TEST_SCRIPTS_DIR}/unit.sh"
-
-define_test "Memory check, bad situation, no checks enabled"
-
-setup_memcheck "bad"
-
-CTDB_MONITOR_FREE_MEMORY=""
-CTDB_MONITOR_FREE_MEMORY_WARN=""
-CTDB_CHECK_SWAP_IS_NOT_USED="no"
-
-ok_null
-
-simple_test
diff --git a/ctdb/tests/eventscripts/00.ctdb.monitor.002.sh b/ctdb/tests/eventscripts/00.ctdb.monitor.002.sh
deleted file mode 100755
index 6e94012..0000000
--- a/ctdb/tests/eventscripts/00.ctdb.monitor.002.sh
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/bin/sh
-
-. "${TEST_SCRIPTS_DIR}/unit.sh"
-
-define_test "Memory check, good situation, all enabled"
-
-setup_memcheck


-- 
Samba Shared Repository
