The branch, master has been updated
       via  dde461868f7 ctdb-tests: Add tests for cluster mutex I/O timeout
       via  25d32ae97a6 ctdb-tests: Terminate event loop if lock is no longer held
       via  061315cc795 ctdb-mutex: Test the lock by locking a 2nd byte range
       via  97a1714ee94 ctdb-mutex: open() and fstat() when testing lock file
       via  c07e81abf04 ctdb-mutex: Factor out function fcntl_lock_fd()
       via  9daf22a5c9d ctdb-mutex: Handle pings from lock checking child to parent
       via  b5db2867913 ctdb-mutex: Do inode checks in a child process
       via  2ecdbcb22c6 ctdb-mutex: Rename wait_for_lost to lock_io_check
       via  7ab2e8f1278 ctdb-mutex: Rename recheck_time to recheck_interval
       via  c396b615047 ctdb-mutex: Consistently use progname in error messages
       via  a8da8810f14 ctdb-tests: Add tests for trivial FD monitoring
       via  8d04235f465 ctdb-common: Add trivial FD monitoring abstraction
       via  f9467cdf3b5 ctdb-build: Link in backtrace support for ctdb_util_tests
       via  7a1c43fc745 ctdb-build: Separate test backtrace support into separate subsystem
       via  b195e8c0d0c ctdb-build: Sort sources in ctdb-util and ctdb_unit_tests
      from  3efa56aa61d ctdb-daemon: Fix printing of tickle ACKs

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit dde461868f7bacf10b2aa141acd609ca0c965209
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Feb 25 19:44:52 2022 +1100

    ctdb-tests: Add tests for cluster mutex I/O timeout

    Block the locker helper child by taking a lock on the 2nd byte of the
    lock file.  This will cause a ping timeout if the process is blocked
    for long enough.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

    Autobuild-User(master): Martin Schwenke <mart...@samba.org>
    Autobuild-Date(master): Thu Jul 28 11:10:54 UTC 2022 on sn-devel-184

commit 25d32ae97a6d7d425eea6f2e9585a1596776493c
Author: Martin Schwenke <mar...@meltin.net>
Date:   Mon Feb 28 16:11:18 2022 +1100

    ctdb-tests: Terminate event loop if lock is no longer held

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 061315cc795d8615fd94d4b23934ca1bf3aecebc
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Feb 8 12:23:42 2022 +1100

    ctdb-mutex: Test the lock by locking a 2nd byte range

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 97a1714ee9427ca22201ebcc5201817d59f17764
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Feb 8 12:15:26 2022 +1100

    ctdb-mutex: open() and fstat() when testing lock file

    This makes a file descriptor available for other I/O.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit c07e81abf04c20fb591376efcaa9b738a60c1a58
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Feb 8 11:56:46 2022 +1100

    ctdb-mutex: Factor out function fcntl_lock_fd()

    Allows blocking mode and start offset to be specified.  Always locks a
    1-byte range.

    Make the lock structure static to avoid initialising the whole
    structure each time.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 9daf22a5c9d4ccdd706de883141ed807cab4df92
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jan 28 13:49:48 2022 +1100

    ctdb-mutex: Handle pings from lock checking child to parent

    The ping timeout is specified by passing an extra argument to the
    mutex helper, representing the ping timeout in seconds.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b5db2867913ba451285728844a1725c9ba5e56c0
Author: Martin Schwenke <mar...@meltin.net>
Date:   Fri Jan 21 13:37:17 2022 +1100

    ctdb-mutex: Do inode checks in a child process

    In future this will allow extra I/O tests and a timeout in the parent
    to (hopefully) release the lock if the child gets wedged.  For
    simplicity, use tmon only to detect when either parent or child goes
    away.  Plumbing a timeout for pings from child to parent will be done
    later.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 2ecdbcb22c69a94576c054761053971774c52099
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Feb 8 09:35:17 2022 +1100

    ctdb-mutex: Rename wait_for_lost to lock_io_check

    This will be generalised to do more I/O-based checks.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 7ab2e8f127859a436ceb24e7c0a5653ae79b2de5
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Jan 19 12:09:07 2022 +1100

    ctdb-mutex: Rename recheck_time to recheck_interval

    There will be more timeouts so clarify the intent of this one.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit c396b6150473f87b5767c0cdb3838fd08ebcb4dc
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Mar 1 09:58:22 2022 +1100

    ctdb-mutex: Consistently use progname in error messages

    To avoid error messages having ridiculously long paths, set progname
    to basename(argv[0]).

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit a8da8810f1457fdfeedaf0a7822016b326b278be
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed Feb 2 21:47:59 2022 +1100

    ctdb-tests: Add tests for trivial FD monitoring

    tmon_ping_test covers complex 2-way interaction between processes
    using tmon_ping_send(), including via a socketpair().

    tmon_test covers the more general functionality of tmon_send() but
    uses a simpler 1-way harness with wide coverage.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 8d04235f4656b9fbd62a00b9663eedafb18956d9
Author: Martin Schwenke <mar...@meltin.net>
Date:   Tue Feb 1 11:44:48 2022 +1100

    ctdb-common: Add trivial FD monitoring abstraction

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit f9467cdf3b522e7f50d3986eb942048baa9c2d28
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed May 4 09:21:38 2022 +1000

    ctdb-build: Link in backtrace support for ctdb_util_tests

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit 7a1c43fc745c09e8dd95eae3de4cebc0678e4110
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed May 4 09:02:12 2022 +1000

    ctdb-build: Separate test backtrace support into separate subsystem

    A convention when testing members of ctdb-util is to include the .c
    file so that static functions can potentially be tested.  This means
    that such tests can't be linked against ctdb-util or duplicate symbols
    will be encountered.

    ctdb-tests-common depends on ctdb-client, which depends in turn on
    ctdb-util, so this can't be used to pull in backtrace support.
    Instead, make ctdb-tests-backtrace its own subsystem.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

commit b195e8c0d0c597be87387f477ebfb62244979e39
Author: Martin Schwenke <mar...@meltin.net>
Date:   Wed May 4 09:17:40 2022 +1000

    ctdb-build: Sort sources in ctdb-util and ctdb_unit_tests

    Also, rename ctdb_unit_tests to ctdb_util_tests.  The sorting makes it
    clear that only items from ctdb-util are tested here.

    Signed-off-by: Martin Schwenke <mar...@meltin.net>
    Reviewed-by: Amitay Isaacs <ami...@gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/common/tmon.c                                 | 602 +++++++++++++++++++++
 ctdb/common/tmon.h                                 | 218 ++++++++
 ctdb/server/ctdb_mutex_fcntl_helper.c              | 509 ++++++++++++++---
 .../simple/cluster.015.reclock_remove_lock.sh      |   2 +-
 .../simple/cluster.016.reclock_move_lock_dir.sh    |   2 +-
 ctdb/tests/UNIT/cunit/cluster_mutex_002.sh         |  32 +-
 ctdb/tests/UNIT/cunit/tmon_test_001.sh             | 195 +++++++
 ctdb/tests/UNIT/cunit/tmon_test_002.sh             | 113 ++++
 ctdb/tests/src/cluster_mutex_test.c                | 100 +++-
 ctdb/tests/src/tmon_ping_test.c                    | 381 +++++++++++++
 ctdb/tests/src/tmon_test.c                         | 395 ++++++++++++++
 ctdb/wscript                                       | 105 +++-
 12 files changed, 2541 insertions(+), 113 deletions(-)
 create mode 100644 ctdb/common/tmon.c
 create mode 100644 ctdb/common/tmon.h
 create mode 100755 ctdb/tests/UNIT/cunit/tmon_test_001.sh
 create mode 100755 ctdb/tests/UNIT/cunit/tmon_test_002.sh
 create mode 100644 ctdb/tests/src/tmon_ping_test.c
 create mode 100644 ctdb/tests/src/tmon_test.c


Changeset truncated at 500 lines:

diff --git a/ctdb/common/tmon.c b/ctdb/common/tmon.c
new file mode 100644
index 00000000000..87a55e3b1e9
--- /dev/null
+++ b/ctdb/common/tmon.c
@@ -0,0 +1,602 @@
+/*
+   Trivial FD monitoring
+
+   Copyright (C) Martin Schwenke & Amitay Isaacs, DataDirect Networks 2022
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see <http://www.gnu.org/licenses/>.
+*/
+
+#include "replace.h"
+
+#include <ctype.h>
+
+#include "lib/util/blocking.h"
+#include "lib/util/sys_rw.h"
+#include "lib/util/tevent_unix.h"
+#include "lib/util/util.h"
+#include "lib/util/smb_strtox.h"
+
+#include "lib/async_req/async_sock.h"
+
+#include "common/tmon.h"
+
+
+enum tmon_message_type {
+	TMON_MSG_EXIT = 1,
+	TMON_MSG_ERRNO,
+	TMON_MSG_PING,
+	TMON_MSG_ASCII,
+	TMON_MSG_CUSTOM,
+};
+
+struct tmon_pkt {
+	enum tmon_message_type type;
+	uint16_t val;
+};
+
+struct tmon_buf {
+	uint8_t data[4];
+};
+
+static void tmon_packet_push(struct tmon_pkt *pkt,
+			     struct tmon_buf *buf)
+{
+	uint16_t type_n, val_n;
+
+	type_n = htons(pkt->type);
+	val_n = htons(pkt->val);
+	memcpy(&buf->data[0], &type_n, 2);
+	memcpy(&buf->data[2], &val_n, 2);
+}
+
+static void tmon_packet_pull(struct tmon_buf *buf,
+			     struct tmon_pkt *pkt)
+{
+	uint16_t type_n, val_n;
+
+	memcpy(&type_n, &buf->data[0], 2);
+	memcpy(&val_n, &buf->data[2], 2);
+
+	pkt->type = ntohs(type_n);
+	pkt->val = ntohs(val_n);
+}
+
+static int tmon_packet_write(int fd, struct tmon_pkt *pkt)
+{
+	struct tmon_buf buf;
+	ssize_t n;
+
+	tmon_packet_push(pkt, &buf);
+
+	n = sys_write(fd, &buf.data[0], sizeof(buf.data));
+	if (n == -1) {
+		return errno;
+	}
+	return 0;
+}
+
+bool tmon_set_exit(struct tmon_pkt *pkt)
+{
+	*pkt = (struct tmon_pkt) {
+		.type = TMON_MSG_EXIT,
+	};
+
+	return true;
+}
+
+bool tmon_set_errno(struct tmon_pkt *pkt, int err)
+{
+	if (err < 0 && err > UINT16_MAX) {
+		return false;
+	}
+
+	*pkt = (struct tmon_pkt) {
+		.type = TMON_MSG_ERRNO,
+		.val = (uint16_t)err,
+	};
+
+	return true;
+}
+
+bool tmon_set_ping(struct tmon_pkt *pkt)
+{
+	*pkt = (struct tmon_pkt) {
+		.type = TMON_MSG_PING,
+	};
+
+	return true;
+}
+
+bool tmon_set_ascii(struct tmon_pkt *pkt, char c)
+{
+	if (!isascii(c)) {
+		return false;
+	}
+
+	*pkt = (struct tmon_pkt) {
+		.type = TMON_MSG_ASCII,
+		.val = (uint16_t)c,
+	};
+
+	return true;
+}
+
+bool tmon_set_custom(struct tmon_pkt *pkt, uint16_t val)
+{
+	*pkt = (struct tmon_pkt) {
+		.type = TMON_MSG_CUSTOM,
+		.val = val,
+	};
+
+	return true;
+}
+
+static bool tmon_parse_exit(struct tmon_pkt *pkt)
+{
+	if (pkt->type != TMON_MSG_EXIT) {
+		return false;
+	}
+	if (pkt->val != 0) {
+		return false;
+	}
+
+	return true;
+}
+
+static bool tmon_parse_errno(struct tmon_pkt *pkt, int *err)
+{
+	if (pkt->type != TMON_MSG_ERRNO) {
+		return false;
+	}
+	*err= (int)pkt->val;
+
+	return true;
+}
+
+bool tmon_parse_ping(struct tmon_pkt *pkt)
+{
+	if (pkt->type != TMON_MSG_PING) {
+		return false;
+	}
+	if (pkt->val != 0) {
+		return false;
+	}
+
+	return true;
+}
+
+bool tmon_parse_ascii(struct tmon_pkt *pkt, char *c)
+{
+	if (pkt->type != TMON_MSG_ASCII) {
+		return false;
+	}
+	if (!isascii((int)pkt->val)) {
+		return false;
+	}
+	*c = (char)pkt->val;
+
+	return true;
+}
+
+bool tmon_parse_custom(struct tmon_pkt *pkt, uint16_t *val)
+{
+	if (pkt->type != TMON_MSG_CUSTOM) {
+		return false;
+	}
+	*val = pkt->val;
+
+	return true;
+}
+
+struct tmon_state {
+	int fd;
+	int direction;
+	struct tevent_context *ev;
+	bool monitor_close;
+	unsigned long write_interval;
+	unsigned long read_timeout;
+	struct tmon_actions actions;
+	struct tevent_timer *timer;
+	void *private_data;
+};
+
+static void tmon_readable(struct tevent_req *subreq);
+static bool tmon_set_timeout(struct tevent_req *req,
+			     struct tevent_context *ev);
+static void tmon_timedout(struct tevent_context *ev,
+			  struct tevent_timer *te,
+			  struct timeval now,
+			  void *private_data);
+static void tmon_write_loop(struct tevent_req *subreq);
+
+struct tevent_req *tmon_send(TALLOC_CTX *mem_ctx,
+			     struct tevent_context *ev,
+			     int fd,
+			     int direction,
+			     unsigned long read_timeout,
+			     unsigned long write_interval,
+			     struct tmon_actions *actions,
+			     void *private_data)
+{
+	struct tevent_req *req, *subreq;
+	struct tmon_state *state;
+	bool status;
+
+	req = tevent_req_create(mem_ctx, &state, struct tmon_state);
+	if (req == NULL) {
+		return NULL;
+	}
+
+	if (actions != NULL) {
+		/* If FD isn't readable then read actions are invalid */
+		if (!(direction & TMON_FD_READ) &&
+		    (actions->timeout_callback != NULL ||
+		     actions->read_callback != NULL ||
+		     read_timeout != 0)) {
+			tevent_req_error(req, EINVAL);
+			return tevent_req_post(req, ev);
+		}
+		/* If FD isn't writeable then write actions are invalid */
+		if (!(direction & TMON_FD_WRITE) &&
+		    (actions->write_callback != NULL ||
+		     write_interval != 0)) {
+			tevent_req_error(req, EINVAL);
+			return tevent_req_post(req, ev);
+		}
+		/* Can't specify write interval without a callback */
+		if (state->write_interval != 0 &&
+		    state->actions.write_callback == NULL) {
+			tevent_req_error(req, EINVAL);
+			return tevent_req_post(req, ev);
+		}
+	}
+
+	state->fd = fd;
+	state->direction = direction;
+	state->ev = ev;
+	state->write_interval = write_interval;
+	state->read_timeout = read_timeout;
+	state->private_data = private_data;
+
+	if (actions != NULL) {
+		state->actions = *actions;
+	}
+
+	status = set_close_on_exec(fd);
+	if (!status) {
+		tevent_req_error(req, errno);
+		return tevent_req_post(req, ev);
+	}
+
+	if (direction & TMON_FD_READ) {
+		subreq = wait_for_read_send(state, ev, fd, true);
+		if (tevent_req_nomem(subreq, req)) {
+			return tevent_req_post(req, ev);
+		}
+		tevent_req_set_callback(subreq, tmon_readable, req);
+	}
+
+	if (state->read_timeout != 0) {
+		status = tmon_set_timeout(req, state->ev);
+		if (!status) {
+			tevent_req_error(req, ENOMEM);
+			return tevent_req_post(req, ev);
+		}
+	}
+
+	if (state->write_interval != 0) {
+		subreq = tevent_wakeup_send(
+			state,
+			state->ev,
+			tevent_timeval_current_ofs(state->write_interval, 0));
+		if (tevent_req_nomem(subreq, req)) {
+			return tevent_req_post(req, state->ev);
+		}
+		tevent_req_set_callback(subreq, tmon_write_loop, req);
+	}
+
+	return req;
+}
+
+static void tmon_readable(struct tevent_req *subreq)
+{
+	struct tevent_req *req = tevent_req_callback_data(
+		subreq, struct tevent_req);
+	struct tmon_state *state = tevent_req_data( req, struct tmon_state);
+	struct tmon_buf buf;
+	struct tmon_pkt pkt;
+	ssize_t nread;
+	bool status;
+	int err;
+	int ret;
+
+	status = wait_for_read_recv(subreq, &ret);
+	TALLOC_FREE(subreq);
+	if (!status) {
+		if (ret == EPIPE && state->actions.close_callback != NULL) {
+			ret = state->actions.close_callback(state->private_data);
+			if (ret == TMON_STATUS_EXIT) {
+				ret = 0;
+			}
+		}
+		if (ret == 0) {
+			tevent_req_done(req);
+		} else {
+			tevent_req_error(req, ret);
+		}
+		return;
+	}
+
+	nread = sys_read(state->fd, buf.data, sizeof(buf.data));
+	if (nread == -1) {
+		tevent_req_error(req, errno);
+		return;
+	}
+	if (nread == 0) {
+		/* Can't happen, treat like EPIPE, above */
+		tevent_req_error(req, EPIPE);
+		return;
+	}
+	if (nread != sizeof(buf.data)) {
+		tevent_req_error(req, EPROTO);
+		return;
+	}
+
+	tmon_packet_pull(&buf, &pkt);
+
+	switch (pkt.type) {
+	case TMON_MSG_EXIT:
+		status = tmon_parse_exit(&pkt);
+		if (!status) {
+			tevent_req_error(req, EPROTO);
+			return;
+		}
+		tevent_req_done(req);
+		return;
+	case TMON_MSG_ERRNO:
+		status = tmon_parse_errno(&pkt, &err);
+		if (!status) {
+			err = EPROTO;
+		}
+		tevent_req_error(req, err);
+		return;
+	default:
+		break;
+	}
+
+	if (state->actions.read_callback == NULL) {
+		/* Shouldn't happen, other end should not write */
+		tevent_req_error(req, EIO);
+		return;
+	}
+	ret = state->actions.read_callback(state->private_data, &pkt);
+	if (ret == TMON_STATUS_EXIT) {
+		tevent_req_done(req);
+		return;
+	}
+	if (ret != 0) {
+		tevent_req_error(req, ret);
+		return;
+	}
+
+	subreq = wait_for_read_send(state, state->ev, state->fd, true);
+	if (tevent_req_nomem(subreq, req)) {
+		return;
+	}
+	tevent_req_set_callback(subreq, tmon_readable, req);
+
+	/* Reset read timeout */
+	if (state->read_timeout != 0) {
+		status = tmon_set_timeout(req, state->ev);
+		if (!status) {
+			tevent_req_error(req, ENOMEM);
+			return;
+		}
+	}
+}
+
+static bool tmon_set_timeout(struct tevent_req *req,
+			     struct tevent_context *ev)
+{
+	struct tmon_state *state = tevent_req_data(
+		req, struct tmon_state);
+	struct timeval endtime =
+		tevent_timeval_current_ofs(state->read_timeout, 0);
+
+	TALLOC_FREE(state->timer);
+
+	state->timer = tevent_add_timer(ev, req, endtime, tmon_timedout, req);
+	if (tevent_req_nomem(state->timer, req)) {
+		return false;
+	}
+
+	return true;
+}
+
+static void tmon_timedout(struct tevent_context *ev,
+			  struct tevent_timer *te,
+			  struct timeval now,
+			  void *private_data)
+{
+	struct tevent_req *req = talloc_get_type_abort(
+		private_data, struct tevent_req);
+	struct tmon_state *state = tevent_req_data(req, struct tmon_state);
+	int ret;
+
+	TALLOC_FREE(state->timer);
+
+	if (state->actions.timeout_callback != NULL) {
+		ret = state->actions.timeout_callback(state->private_data);
+		if (ret == TMON_STATUS_EXIT) {
+			ret = 0;
+		}
+	} else {
+		ret = ETIMEDOUT;
+	}
+
+	if (ret == 0) {
+		tevent_req_done(req);
+	} else {
+		tevent_req_error(req, ret);
+	}
+}
+
+static void tmon_write_loop(struct tevent_req *subreq)
+{
+	struct tevent_req *req = tevent_req_callback_data(
+		subreq, struct tevent_req);
+	struct tmon_state *state = tevent_req_data(
+		req, struct tmon_state);
+	struct tmon_pkt pkt;
+	int ret;
+	bool status;
+
+	status = tevent_wakeup_recv(subreq);
+	TALLOC_FREE(subreq);
+	if (!status) {
+		/* Ignore error */
+	}
+
+	ret = state->actions.write_callback(state->private_data, &pkt);
+	if (ret == TMON_STATUS_EXIT) {
+		tevent_req_done(req);
+		return;
+	}
+	if (ret == TMON_STATUS_SKIP) {
+		goto done;
+	}
+	if (ret != 0) {
+		tevent_req_error(req, ret);
+		return;
+	}
+
+	status = tmon_write(req, &pkt);
+	if (!status) {
+		return;
+	}
+
+done:
+	subreq = tevent_wakeup_send(
+		state,
+		state->ev,
+		tevent_timeval_current_ofs(state->write_interval, 0));
+	if (tevent_req_nomem(subreq, req)) {


-- 
Samba Shared Repository
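
The log entry for "ctdb-mutex: Factor out function fcntl_lock_fd()" describes a helper that takes a blocking mode and a start offset and always locks a 1-byte range, but the changeset is truncated before that code appears. The following is only a minimal sketch of that idea, assuming POSIX fcntl() record locks. The name lock_one_byte() and its "0 or errno" return convention are hypothetical and are not taken from the commit.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical illustration only, not the code from the commit.
 *
 * Take an exclusive (write) lock on the single byte at offset
 * "start" of the file referred to by fd.  If "block" is true then
 * wait for the lock, otherwise fail immediately if it is held by
 * another process.  Returns 0 on success, errno on failure.
 */
int lock_one_byte(int fd, bool block, off_t start)
{
	/*
	 * Static, so the constant fields are only initialised once,
	 * in the spirit of what the commit message describes.
	 */
	static struct flock lock = {
		.l_type = F_WRLCK,
		.l_whence = SEEK_SET,
		.l_len = 1,
	};
	int cmd = block ? F_SETLKW : F_SETLK;

	/* Only the start offset changes between calls */
	lock.l_start = start;

	if (fcntl(fd, cmd, &lock) != 0) {
		return errno;
	}

	return 0;
}
```

With a helper of this shape, a process that already holds byte 0 could probe a second byte with lock_one_byte(fd, false, 1). That is one plausible reading of "Test the lock by locking a 2nd byte range", not a description of the actual helper.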