The branch, master has been updated
       via  d744eb0 ctdb-doc: Add reference to new magepage ctdb-statistics
       via  efd34bb ctdb-doc: Add ctdb-statistics manual page
       via  f5f11e1 ctdb-daemon: Decrement pending calls statistics when calls are deferred
       via  3c1bae1 ctdb-tests: Do not expect real-time priority when running local daemons
       via  d410b20 ctdb-daemon: Make sure ctdb runs with real-time priority
       via  7ae7a9c ctdb-locking: Fork lock helper with vfork_with_logging()
       via  2e17b0e ctdb-locking: Add argc parameter to lock_helper_args()
      from  8c1b143 media_harmony: Fix a crash bug

http://gitweb.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit d744eb03c5236284cf0141c1a2f687263cbd8414
Author: Amitay Isaacs <ami...@gmail.com>
Date:   Fri Sep 12 16:24:09 2014 +1000

    ctdb-doc: Add reference to new magepage ctdb-statistics

    Signed-off-by: Amitay Isaacs <ami...@gmail.com>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

    Autobuild-User(master): Amitay Isaacs <ami...@samba.org>
    Autobuild-Date(master): Fri Sep 12 11:13:56 CEST 2014 on sn-devel-104

commit efd34bb274a5ed015d7fe9374718671e0d7f9cc6
Author: Amitay Isaacs <ami...@gmail.com>
Date:   Fri Sep 12 14:22:00 2014 +1000

    ctdb-doc: Add ctdb-statistics manual page

    Signed-off-by: Amitay Isaacs <ami...@gmail.com>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit f5f11e1a05d4d75a7662d6c413a14c4cd18f8ed9
Author: Amitay Isaacs <ami...@gmail.com>
Date:   Fri Sep 12 10:50:27 2014 +1000

    ctdb-daemon: Decrement pending calls statistics when calls are deferred

    Deferred calls should not be treated as pending calls since they are
    re-processed from the beginning.

    Signed-off-by: Amitay Isaacs <ami...@gmail.com>
    Reviewed-by: Martin Schwenke <mar...@meltin.net>

commit 3c1bae12217ead74863a7cdd9b8a338aef80adb1
Author: Amitay Isaacs <ami...@gmail.com>
Date:   Fri Sep 12 11:25:14 2014 +1000

    ctdb-tests: Do not expect real-time priority when running local daemons

    Local daemons are started mainly for testing and usually not as root.
Signed-off-by: Amitay Isaacs <ami...@gmail.com> Reviewed-by: Martin Schwenke <mar...@meltin.net> commit d410b20601cccd8b67d48c42a6d689cd65e94f61 Author: Amitay Isaacs <ami...@gmail.com> Date: Fri Sep 12 11:22:36 2014 +1000 ctdb-daemon: Make sure ctdb runs with real-time priority Signed-off-by: Amitay Isaacs <ami...@gmail.com> Reviewed-by: Martin Schwenke <mar...@meltin.net> commit 7ae7a9c46301e4fed870516c448a79bb7a9ac53a Author: Martin Schwenke <mar...@meltin.net> Date: Wed Aug 13 15:01:54 2014 +1000 ctdb-locking: Fork lock helper with vfork_with_logging() Otherwise errors printed by the lock helper get lost. lock_helper_args() no longer adds the program name to the list of arguments, since vfork_with_logging() does that. Update the lock helper to handle the extra log_fd parameter passed by vfork_with_logging() and send stdout/stderr there. Signed-off-by: Martin Schwenke <mar...@meltin.net> Reviewed-by: Amitay Isaacs <ami...@gmail.com> commit 2e17b0ecddffb8590c4e8b9afaf1767ef7e8f89c Author: Martin Schwenke <mar...@meltin.net> Date: Wed Aug 13 14:46:31 2014 +1000 ctdb-locking: Add argc parameter to lock_helper_args() To make this sane, also add an argv parameter and change the return type to bool. Anticipating a subsequent change, make the type of argv match what is needed by vfork_with_logging() and cast it when passing to execv(). This also means changing the type of the name member of struct db_namelist. 

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/common/system_util.c                    |   11 +-
 ctdb/doc/Makefile                            |    1 +
 ctdb/doc/ctdb-statistics.7.xml               |  669 ++++++++++++++++++++++++++
 ctdb/doc/ctdb.1.xml                          |   14 +-
 ctdb/doc/ctdb.7.xml                          |    3 +
 ctdb/include/ctdb_private.h                  |    2 +-
 ctdb/packaging/RPM/ctdb.spec.in              |    1 +
 ctdb/server/ctdb_daemon.c                    |    8 +-
 ctdb/server/ctdb_lock.c                      |   77 ++--
 ctdb/server/ctdb_lock_helper.c               |   30 +-
 ctdb/tests/simple/scripts/local_daemons.bash |    2 +-
 ctdb/wscript                                 |    4 +-
 12 files changed, 759 insertions(+), 63 deletions(-)
 create mode 100644 ctdb/doc/ctdb-statistics.7.xml


Changeset truncated at 500 lines:

diff --git a/ctdb/common/system_util.c b/ctdb/common/system_util.c
index 692bc25..8e8f4ac 100644
--- a/ctdb/common/system_util.c
+++ b/ctdb/common/system_util.c
@@ -37,7 +37,7 @@
 /*
   if possible, make this task real time
  */
-void set_scheduler(void)
+bool set_scheduler(void)
 {
 #ifdef _AIX_
 #if HAVE_THREAD_SETSCHED
@@ -47,14 +47,15 @@ void set_scheduler(void)
 	ti = 0ULL;
 	if (getthrds64(getpid(), &te, sizeof(te), &ti, 1) != 1) {
 		DEBUG(DEBUG_ERR, ("Unable to get thread information\n"));
-		return;
+		return false;
 	}
 
 	if (thread_setsched(te.ti_tid, 0, SCHED_RR) == -1) {
 		DEBUG(DEBUG_ERR, ("Unable to set scheduler to SCHED_RR (%s)\n",
 				  strerror(errno)));
+		return false;
 	} else {
-		DEBUG(DEBUG_NOTICE, ("Set scheduler to SCHED_RR\n"));
+		return true;
 	}
 #endif
 #else /* no AIX */
@@ -70,11 +71,13 @@ void set_scheduler(void)
 	if (sched_setscheduler(0, policy, &p) == -1) {
 		DEBUG(DEBUG_CRIT,("Unable to set scheduler to SCHED_FIFO (%s)\n",
 					 strerror(errno)));
+		return false;
 	} else {
-		DEBUG(DEBUG_NOTICE,("Set scheduler to SCHED_FIFO\n"));
+		return true;
 	}
 #endif
 #endif
+	return false;
 }
 
 /*
diff --git a/ctdb/doc/Makefile b/ctdb/doc/Makefile
index 34303a5..b2240a3 100644
--- a/ctdb/doc/Makefile
+++ b/ctdb/doc/Makefile
@@ -6,6 +6,7 @@ DOCS = ctdb.1 ctdb.1.html \
 	ping_pong.1 ping_pong.1.html \
 	ctdbd.conf.5 ctdbd.conf.5.html \
 	ctdb.7 ctdb.7.html \
+	ctdb-statistics.7 ctdb-statistics.7.html \
 	ctdb-tunables.7 ctdb-tunables.7.html
 
 all: $(DOCS)
diff --git a/ctdb/doc/ctdb-statistics.7.xml b/ctdb/doc/ctdb-statistics.7.xml
new file mode 100644
index 0000000..77301ab
--- /dev/null
+++ b/ctdb/doc/ctdb-statistics.7.xml
@@ -0,0 +1,669 @@
+<?xml version="1.0" encoding="iso-8859-1"?>
+<!DOCTYPE refentry
+	PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
+	"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
+
+<refentry id="ctdb-statistics.7">
+
+  <refmeta>
+    <refentrytitle>ctdb-statistics</refentrytitle>
+    <manvolnum>7</manvolnum>
+    <refmiscinfo class="source">ctdb</refmiscinfo>
+    <refmiscinfo class="manual">CTDB - clustered TDB database</refmiscinfo>
+  </refmeta>
+
+  <refnamediv>
+    <refname>ctdb-statistics</refname>
+    <refpurpose>CTDB statistics output</refpurpose>
+  </refnamediv>
+
+  <refsect1>
+    <title>OVERALL STATISTICS</title>
+
+    <para>
+      CTDB maintains information about various messages communicated
+      and some of the important operations per node.  See the
+      <citerefentry><refentrytitle>ctdb</refentrytitle>
+      <manvolnum>1</manvolnum></citerefentry> commands
+      <command>statistics</command> and <command>statisticsreset</command>
+      for displaying statistics.
+    </para>
+
+    <refsect2>
+      <title>Example: ctdb statistics</title>
+      <screen>
+CTDB version 1
+Current time of statistics  :                Fri Sep 12 13:32:32 2014
+Statistics collected since  : (000 01:49:20) Fri Sep 12 11:43:12 2014
+ num_clients                        6
+ frozen                             0
+ recovering                         0
+ num_recoveries                     2
+ client_packets_sent           281293
+ client_packets_recv           296317
+ node_packets_sent             452387
+ node_packets_recv             182394
+ keepalive_packets_sent          3927
+ keepalive_packets_recv          3928
+ node
+     req_call                   48605
+     reply_call                     1
+     req_dmaster                23404
+     reply_dmaster              24917
+     reply_error                    0
+     req_message                  958
+     req_control               197513
+     reply_control             153705
+ client
+     req_call                  130866
+     req_message                  770
+     req_control               168921
+ timeouts
+     call                           0
+     control                        0
+     traverse                       0
+ locks
+     num_calls                    220
+     num_current                    0
+     num_pending                    0
+     num_failed                     0
+ total_calls                   130866
+ pending_calls                      0
+ childwrite_calls                   1
+ pending_childwrite_calls           0
+ memory_used                   334490
+ max_hop_count                     18
+ total_ro_delegations               2
+ total_ro_revokes                   2
+ hop_count_buckets: 42816 5464 26 1 0 0 0 0 0 0 0 0 0 0 0 0
+ lock_buckets: 9 165 14 15 7 2 2 0 0 0 0 0 0 0 0 0
+ locks_latency      MIN/AVG/MAX     0.000685/0.160302/6.369342 sec out of 214
+ reclock_ctdbd      MIN/AVG/MAX     0.004940/0.004969/0.004998 sec out of 2
+ reclock_recd       MIN/AVG/MAX     0.000000/0.000000/0.000000 sec out of 0
+ call_latency       MIN/AVG/MAX     0.000006/0.000719/4.562991 sec out of 126626
+ childwrite_latency MIN/AVG/MAX     0.014527/0.014527/0.014527 sec out of 1
+      </screen>
+    </refsect2>
+
+    <refsect2>
+      <title>CTDB version</title>
+      <para>
+        Version of the ctdb protocol used by the node.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>Current time of statistics</title>
+      <para>
+        Time when the statistics are generated.
+      </para>
+      <para>
+        This is useful when collecting statistics output periodically
+        for post-processing.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>Statistics collected since</title>
+      <para>
+        Time when ctdb was started or the last time statistics was reset.
+        The output shows the duration and the timestamp.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>num_clients</title>
+      <para>
+        Number of processes currently connected to CTDB's unix socket.
+        This includes recovery daemon, ctdb tool and samba processes
+        (smbd, winbindd).
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>frozen</title>
+      <para>
+        1 if the the databases are currently frozen, 0 otherwise.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>recovering</title>
+      <para>
+        1 if recovery is active, 0 otherwise.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>num_recoveries</title>
+      <para>
+        Number of recoveries since the start of ctdb or since the last
+        statistics reset.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>client_packets_sent</title>
+      <para>
+        Number of packets sent to client processes via unix domain socket.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>client_packets_recv</title>
+      <para>
+        Number of packets received from client processes via unix domain socket.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>node_packets_sent</title>
+      <para>
+        Number of packets sent to the other nodes in the cluster via TCP.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>node_packets_recv</title>
+      <para>
+        Number of packets received from the other nodes in the cluster via TCP.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>keepalive_packets_sent</title>
+      <para>
+        Number of keepalive messages sent to other nodes.
+      </para>
+      <para>
+        CTDB periodically sends keepalive messages to other nodes.
+        See <citetitle>KeepaliveInterval</citetitle> tunable in
+        <citerefentry><refentrytitle>ctdb-tunables</refentrytitle>
+        <manvolnum>7</manvolnum></citerefentry> for more details.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>keepalive_packets_recv</title>
+      <para>
+        Number of keepalive messages received from other nodes.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>node</title>
+      <para>
+        This section lists various types of messages processed which
+        originated from other nodes via TCP.
+      </para>
+
+      <refsect3>
+        <title>req_call</title>
+        <para>
+          Number of REQ_CALL messages from the other nodes.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>reply_call</title>
+        <para>
+          Number of REPLY_CALL messages from the other nodes.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>req_dmaster</title>
+        <para>
+          Number of REQ_DMASTER messages from the other nodes.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>reply_dmaster</title>
+        <para>
+          Number of REPLY_DMASTER messages from the other nodes.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>reply_error</title>
+        <para>
+          Number of REPLY_ERROR messages from the other nodes.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>req_message</title>
+        <para>
+          Number of REQ_MESSAGE messages from the other nodes.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>req_control</title>
+        <para>
+          Number of REQ_CONTROL messages from the other nodes.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>reply_control</title>
+        <para>
+          Number of REPLY_CONTROL messages from the other nodes.
+        </para>
+      </refsect3>
+
+    </refsect2>
+
+    <refsect2>
+      <title>client</title>
+      <para>
+        This section lists various types of messages processed which
+        originated from clients via unix domain socket.
+      </para>
+
+      <refsect3>
+        <title>req_call</title>
+        <para>
+          Number of REQ_CALL messages from the clients.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>req_message</title>
+        <para>
+          Number of REQ_MESSAGE messages from the clients.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>req_control</title>
+        <para>
+          Number of REQ_CONTROL messages from the clients.
+        </para>
+      </refsect3>
+
+    </refsect2>
+
+    <refsect2>
+      <title>timeouts</title>
+      <para>
+        This section lists timeouts occurred when sending various messages.
+      </para>
+
+      <refsect3>
+        <title>call</title>
+        <para>
+          Number of timeouts for REQ_CALL messages.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>control</title>
+        <para>
+          Number of timeouts for REQ_CONTROL messages.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>traverse</title>
+        <para>
+          Number of timeouts for database traverse operations.
+        </para>
+      </refsect3>
+    </refsect2>
+
+    <refsect2>
+      <title>locks</title>
+      <para>
+        This section lists locking statistics.
+      </para>
+
+      <refsect3>
+        <title>num_calls</title>
+        <para>
+          Number of completed lock calls.  This includes database locks
+          and record locks.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>num_current</title>
+        <para>
+          Number of scheduled lock calls.  This includes database locks
+          and record locks.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>num_pending</title>
+        <para>
+          Number of queued lock calls.  This includes database locks and
+          record locks.
+        </para>
+      </refsect3>
+
+      <refsect3>
+        <title>num_failed</title>
+        <para>
+          Number of failed lock calls.  This includes database locks and
+          record locks.
+        </para>
+      </refsect3>
+
+    </refsect2>
+
+    <refsect2>
+      <title>total_calls</title>
+      <para>
+        Number of req_call messages processed from clients.  This number
+        should be same as client --> req_call.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>pending_calls</title>
+      <para>
+        Number of req_call messages which are currenly being processed.
+        This number indicates the number of record migrations in flight.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>childwrite_calls</title>
+      <para>
+        Number of record update calls.  Record update calls are used to
+        update a record under a transaction.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>pending_childwrite_calls</title>
+      <para>
+        Number of record update calls currently active.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>memory_used</title>
+      <para>
+        The amount of memory in bytes currently used by CTDB using
+        talloc.  This includes all the memory used for CTDB's internal
+        data structures.  This does not include the memory mapped TDB
+        databases.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>max_hop_count</title>
+      <para>
+        The maximum number of hops required for a record migration request
+        to obtain the record.  High numbers indicate record contention.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>total_ro_delegations</title>
+      <para>
+        Number of readonly delegations created.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>total_ro_revokes</title>
+      <para>
+        Number of readonly delegations that were revoked.  The difference
+        between total_ro_revokes and total_ro_delegations gives the
+        number of currently active readonly delegations.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>hop_count_buckets</title>
+      <para>
+        Distribution of migration requests based on hop counts values.
+        Buckets are 1, &lt; 4, &lt; 8, &lt; 16, &lt; 32, &lt; 64, &lt;
+        128, &lt; 256, &lt; 512, ≥ 512.
+      </para>
+    </refsect2>
+
+    <refsect2>
+      <title>lock_buckets</title>
+      <para>
+        Distribution of record lock requests based on time required to
+        obtain locks.  Buckets are &lt; 1ms, &lt; 10ms, &lt; 100ms,
+        &lt; 1s, &lt; 2s, &lt; 4s, &lt; 8s, &lt; 16s, &lt; 32s, &lt;
+        64s, ≥ 64s.


-- 
Samba Shared Repository
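
As a reading aid for the hop_count_buckets text in the truncated manual
page above, here is a minimal C sketch of how a hop count could be mapped
onto the ten buckets it lists (1, < 4, < 8, < 16, < 32, < 64, < 128,
< 256, < 512, >= 512). It is not taken from the CTDB sources and may not
match how ctdbd actually fills the buckets. The names hop_bounds and
hop_bucket are hypothetical, and placing a hop count of 0 in the first
bucket is an assumption.

```c
#include <stddef.h>

/* Exclusive upper bounds of the first nine buckets, as listed in the
 * manual page text.  Hypothetical table, not CTDB code. */
static const unsigned int hop_bounds[] = {
	2, 4, 8, 16, 32, 64, 128, 256, 512
};

#define NUM_HOP_BOUNDS (sizeof(hop_bounds) / sizeof(hop_bounds[0]))

/* Return the bucket index (0..9) for a hop count.
 * Assumption: hop counts 0 and 1 both land in bucket 0. */
static size_t hop_bucket(unsigned int hops)
{
	size_t i;

	for (i = 0; i < NUM_HOP_BOUNDS; i++) {
		if (hops < hop_bounds[i]) {
			return i;
		}
	}

	/* Everything at or above 512 goes into the last bucket. */
	return NUM_HOP_BOUNDS;
}
```

Under these assumptions, the max_hop_count of 18 from the example output
would fall into the "< 32" bucket, that is, hop_bucket(18) returns index 4.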