The branch, master has been updated via 8d787f7 tdb: Align integer types via 608df97 gencache: Prune expired entries via c13eb55 gencache: Wipe corrupt databases via 1386200 gencache: Remove transaction-based tdb via a1e13b4 gencache: Add crc check via 0d7f67f gencache: Convert to a binary timestamp via 72ec893 tdb: Allow !CLEAR_IF_FIRST & MUTEX_LOCKING via e814fb6 tdb: Version 1.3.17 for tdb_traverse_chain via 3daf9c8 tdb: Add test for tdb_traverse_chain via 46a87f2 tdb: Add tdb_traverse_chain from 6e16e95 ctdb-daemon: Do not fork when CTDB_TEST_MODE is set
https://git.samba.org/?p=samba.git;a=shortlog;h=master - Log ----------------------------------------------------------------- commit 8d787f73bbcaff181583eadb9b15dcaf863c032c Author: Volker Lendecke <v...@samba.org> Date: Sat Nov 3 10:11:26 2018 +0100 tdb: Align integer types tdb->max_dead_records is "int", as is the corresponding parameter to tdb_set_max_dead(). Not that a signed variable makes any sense, but this is old code and tdb_set_max_dead() is a public API which we should not change for this. Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> Autobuild-User(master): Jeremy Allison <j...@samba.org> Autobuild-Date(master): Tue Nov 6 21:52:32 CET 2018 on sn-devel-144 commit 608df97d5dbc4e56456b83a28e3b2251a9aa4007 Author: Volker Lendecke <v...@samba.org> Date: Wed Oct 24 10:51:40 2018 +0200 gencache: Prune expired entries This solves the problem that gencache never shrinks right now. Whenever we write an entry, we now walk that entry's chain and delete expired entries. This should be a good balance between performance and cleanup actions: Reading is still unaffected, and those who write pay a small penalty while keeping gencache size under control. Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> commit c13eb55253a00d9472c749ccf58dd13e0efd1518 Author: Volker Lendecke <v...@samba.org> Date: Fri Nov 2 16:58:53 2018 +0100 gencache: Wipe corrupt databases Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> commit 1386200be5c583c680c3894a11688a0e0a3d2285 Author: Volker Lendecke <v...@samba.org> Date: Thu Oct 11 12:52:40 2018 +0200 gencache: Remove transaction-based tdb At more than one large site I've seen significant problems due to gencache_stabilize. gencache_stabilize was mainly introduced to survive machine crashes with the cache still being in place. Given that most installations crash rarely and this is still a cache, this safety is overkill and causes real problems. With the recent changes to tdb, we should be safe enough to run on completely corrupted databases and properly detect errors. A further commit will introduce code that wipes the gencache.tdb if such a corruption is detected. There is one kind of corruption that we don't properly handle: Orphaned space in the database. I don't have a good idea yet how to handle this in a graceful and efficient way during normal operations, but maybe this idea pops up at some point. Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> commit a1e13b4a5dd98bd13790c52e5745b8d04e1bda15 Author: Volker Lendecke <v...@samba.org> Date: Wed Oct 10 16:53:10 2018 +0200 gencache: Add crc check This covers key, timestamp and data. This will detect silent corruption of gencache data after a system crash Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> commit 0d7f67f07cd96943dfe58e91254145e28e047ace Author: Volker Lendecke <v...@samba.org> Date: Wed Oct 10 16:12:28 2018 +0200 gencache: Convert to a binary timestamp Two reasons: The ascii conversion shows up on profiles. In a further commit we will get checksums for gencache entries to protect at hidden corruption due to a crash on the non-transactioned gencache.tdb. Next to the timestamp this is a second field that is gencache metadata, and I don't want to deal with a second ascii number when at least some of the gencache values are binary already. Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> commit 72ec893d0aaa518a2cf22aa68b16c3907a17dfc8 Author: Volker Lendecke <v...@samba.org> Date: Mon Oct 22 08:57:00 2018 +0200 tdb: Allow !CLEAR_IF_FIRST & MUTEX_LOCKING This is a prerequisite to allow gencache to run on a non-transactioned database with mutexes. Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> commit e814fb6bcba8687dafb7d0d1654df8225b2e6229 Author: Volker Lendecke <v...@samba.org> Date: Mon Oct 29 07:43:43 2018 +0100 tdb: Version 1.3.17 for tdb_traverse_chain Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> commit 3daf9c85b742ec9365faa123e38e23a0d80a69a0 Author: Volker Lendecke <v...@samba.org> Date: Sun Oct 7 22:03:09 2018 +0200 tdb: Add test for tdb_traverse_chain Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> commit 46a87f2cbafdc84199e7e131f1d88961b37a2f78 Author: Volker Lendecke <v...@samba.org> Date: Sun Oct 28 09:06:39 2018 +0100 tdb: Add tdb_traverse_chain This is a lightweight readonly traverse of a single chain, see the comment in the header file. Signed-off-by: Volker Lendecke <v...@samba.org> Reviewed-by: Jeremy Allison <j...@samba.org> ----------------------------------------------------------------------- Summary of changes: lib/tdb/ABI/{tdb-1.3.15.sigs => tdb-1.3.17.sigs} | 2 + lib/tdb/common/freelist.c | 2 +- lib/tdb/common/open.c | 91 ++-- lib/tdb/common/traverse.c | 110 +++++ lib/tdb/include/tdb.h | 67 +++ lib/tdb/test/run-mutex-openflags2.c | 7 - lib/tdb/test/run-traverse-chain.c | 94 ++++ lib/tdb/wscript | 3 +- source3/lib/gencache.c | 566 ++++++----------------- source3/lib/gencache.h | 1 - source3/nmbd/nmbd.c | 2 - source3/smbd/server_exit.c | 1 - source3/torture/test_namemap_cache.c | 2 - source3/utils/net.c | 2 - source3/utils/net_cache.c | 26 -- source3/winbindd/winbindd.c | 2 - 16 files changed, 465 insertions(+), 513 deletions(-) copy lib/tdb/ABI/{tdb-1.3.15.sigs => tdb-1.3.17.sigs} (95%) create mode 100644 lib/tdb/test/run-traverse-chain.c Changeset truncated at 500 lines: diff --git a/lib/tdb/ABI/tdb-1.3.15.sigs b/lib/tdb/ABI/tdb-1.3.17.sigs similarity index 95% copy from lib/tdb/ABI/tdb-1.3.15.sigs copy to lib/tdb/ABI/tdb-1.3.17.sigs index 61ce5e6..e2b0427 100644 --- a/lib/tdb/ABI/tdb-1.3.15.sigs +++ b/lib/tdb/ABI/tdb-1.3.17.sigs @@ -63,6 +63,8 @@ tdb_transaction_start_nonblock: int (struct tdb_context *) tdb_transaction_write_lock_mark: int (struct tdb_context *) tdb_transaction_write_lock_unmark: int (struct tdb_context *) tdb_traverse: int (struct tdb_context *, tdb_traverse_func, void *) +tdb_traverse_chain: int (struct tdb_context *, unsigned int, tdb_traverse_func, void *) +tdb_traverse_key_chain: int (struct tdb_context *, TDB_DATA, tdb_traverse_func, void *) tdb_traverse_read: int (struct tdb_context *, tdb_traverse_func, void *) tdb_unlock: int (struct tdb_context *, int, int) tdb_unlockall: int (struct tdb_context *) diff --git a/lib/tdb/common/freelist.c b/lib/tdb/common/freelist.c index 9064386..37a4c16 100644 --- a/lib/tdb/common/freelist.c +++ b/lib/tdb/common/freelist.c @@ -557,7 +557,7 @@ static bool tdb_alloc_dead( static void tdb_purge_dead(struct tdb_context *tdb, uint32_t hash) { - uint32_t max_dead_records = tdb->max_dead_records; + int max_dead_records = tdb->max_dead_records; tdb->max_dead_records = 0; diff --git a/lib/tdb/common/open.c b/lib/tdb/common/open.c index 899a2fc..be5f807 100644 --- a/lib/tdb/common/open.c +++ b/lib/tdb/common/open.c @@ -230,8 +230,6 @@ static bool check_header_hash(struct tdb_context *tdb, static bool tdb_mutex_open_ok(struct tdb_context *tdb, const struct tdb_header *header) { - int locked; - if (tdb->flags & TDB_NOLOCK) { /* * We don't look at locks, so it does not matter to have a @@ -240,37 +238,6 @@ static bool tdb_mutex_open_ok(struct tdb_context *tdb, return true; } - locked = tdb_nest_lock(tdb, ACTIVE_LOCK, F_WRLCK, - TDB_LOCK_NOWAIT|TDB_LOCK_PROBE); - - if ((locked == -1) && (tdb->ecode == TDB_ERR_LOCK)) { - /* - * CLEAR_IF_FIRST still active. The tdb was created on this - * host, so we can assume the mutex implementation is - * compatible. Important for tools like tdbdump on a still - * open locking.tdb. - */ - goto check_local_settings; - } - - /* - * We got the CLEAR_IF_FIRST lock. That means the database was - * potentially copied from somewhere else. The mutex implementation - * might be incompatible. - */ - - if (tdb_nest_unlock(tdb, ACTIVE_LOCK, F_WRLCK, false) == -1) { - /* - * Should not happen - */ - TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_mutex_open_ok: " - "failed to release ACTIVE_LOCK on %s: %s\n", - tdb->name, strerror(errno))); - return false; - } - -check_local_settings: - if (!(tdb->flags & TDB_MUTEX_LOCKING)) { TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_mutex_open_ok[%s]: " "Can use mutexes only with " @@ -416,14 +383,6 @@ _PUBLIC_ struct tdb_context *tdb_open_ex(const char *name, int hash_size, int td * the runtime check for existing tdb's comes later. */ - if (!(tdb->flags & TDB_CLEAR_IF_FIRST)) { - TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_open_ex: " - "invalid flags for %s - TDB_MUTEX_LOCKING " - "requires TDB_CLEAR_IF_FIRST\n", name)); - errno = EINVAL; - goto fail; - } - if (tdb->flags & TDB_INTERNAL) { TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_open_ex: " "invalid flags for %s - TDB_MUTEX_LOCKING and " @@ -632,6 +591,30 @@ _PUBLIC_ struct tdb_context *tdb_open_ex(const char *name, int hash_size, int td * mutex locking. */ tdb->hdr_ofs = header.mutex_size; + + if ((!(tdb_flags & TDB_CLEAR_IF_FIRST)) && (!tdb->read_only)) { + /* + * Open an existing mutexed tdb, but without + * CLEAR_IF_FIRST. We need to initialize the + * mutex array and keep the CLEAR_IF_FIRST + * lock locked. + */ + ret = tdb_nest_lock(tdb, ACTIVE_LOCK, F_WRLCK, + TDB_LOCK_NOWAIT|TDB_LOCK_PROBE); + locked = (ret == 0); + + if (locked) { + ret = tdb_mutex_init(tdb); + if (ret == -1) { + TDB_LOG((tdb, + TDB_DEBUG_FATAL, + "tdb_open_ex: tdb_mutex_init " + "failed for ""%s: %s\n", + name, strerror(errno))); + goto fail; + } + } + } } if ((header.magic1_hash == 0) && (header.magic2_hash == 0)) { @@ -708,15 +691,19 @@ _PUBLIC_ struct tdb_context *tdb_open_ex(const char *name, int hash_size, int td goto fail; } + } - /* We always need to do this if the CLEAR_IF_FIRST flag is set, even if - we didn't get the initial exclusive lock as we need to let all other - users know we're using it. */ + if (locked || (tdb_flags & TDB_CLEAR_IF_FIRST)) { + /* + * We always need to do this if the CLEAR_IF_FIRST + * flag is set, even if we didn't get the initial + * exclusive lock as we need to let all other users + * know we're using it. + */ - if (tdb_flags & TDB_CLEAR_IF_FIRST) { - /* leave this lock in place to indicate it's in use */ - if (tdb_nest_lock(tdb, ACTIVE_LOCK, F_RDLCK, TDB_LOCK_WAIT) == -1) { + ret = tdb_nest_lock(tdb, ACTIVE_LOCK, F_RDLCK, TDB_LOCK_WAIT); + if (ret == -1) { goto fail; } } @@ -932,7 +919,10 @@ fail: seek pointer from our parent and to re-establish locks */ _PUBLIC_ int tdb_reopen(struct tdb_context *tdb) { - return tdb_reopen_internal(tdb, tdb->flags & TDB_CLEAR_IF_FIRST); + bool active_lock; + active_lock = (tdb->flags & (TDB_CLEAR_IF_FIRST|TDB_MUTEX_LOCKING)); + + return tdb_reopen_internal(tdb, active_lock); } /* reopen all tdb's */ @@ -941,7 +931,10 @@ _PUBLIC_ int tdb_reopen_all(int parent_longlived) struct tdb_context *tdb; for (tdb=tdbs; tdb; tdb = tdb->next) { - bool active_lock = (tdb->flags & TDB_CLEAR_IF_FIRST); + bool active_lock; + + active_lock = + (tdb->flags & (TDB_CLEAR_IF_FIRST|TDB_MUTEX_LOCKING)); /* * If the parent is longlived (ie. a diff --git a/lib/tdb/common/traverse.c b/lib/tdb/common/traverse.c index a9af1d4..54a69dc 100644 --- a/lib/tdb/common/traverse.c +++ b/lib/tdb/common/traverse.c @@ -399,3 +399,113 @@ _PUBLIC_ TDB_DATA tdb_nextkey(struct tdb_context *tdb, TDB_DATA oldkey) return key; } +_PUBLIC_ int tdb_traverse_chain(struct tdb_context *tdb, + unsigned chain, + tdb_traverse_func fn, + void *private_data) +{ + tdb_off_t rec_ptr; + struct tdb_chainwalk_ctx chainwalk; + int count = 0; + int ret; + + if (chain >= tdb->hash_size) { + tdb->ecode = TDB_ERR_EINVAL; + return -1; + } + + if (tdb->traverse_read != 0) { + tdb->ecode = TDB_ERR_LOCK; + return -1; + } + + ret = tdb_lock(tdb, chain, F_RDLCK); + if (ret == -1) { + return -1; + } + + tdb->traverse_read += 1; + + ret = tdb_ofs_read(tdb, TDB_HASH_TOP(chain), &rec_ptr); + if (ret == -1) { + goto fail; + } + + tdb_chainwalk_init(&chainwalk, rec_ptr); + + while (rec_ptr != 0) { + struct tdb_record rec; + bool ok; + + ret = tdb_rec_read(tdb, rec_ptr, &rec); + if (ret == -1) { + goto fail; + } + + if (!TDB_DEAD(&rec)) { + /* no overflow checks, tdb_rec_read checked it */ + tdb_off_t key_ofs = rec_ptr + sizeof(rec); + size_t full_len = rec.key_len + rec.data_len; + uint8_t *buf = NULL; + + TDB_DATA key = { .dsize = rec.key_len }; + TDB_DATA data = { .dsize = rec.data_len }; + + if ((tdb->transaction == NULL) && + (tdb->map_ptr != NULL)) { + ret = tdb->methods->tdb_oob( + tdb, key_ofs, full_len, 0); + if (ret == -1) { + goto fail; + } + key.dptr = (uint8_t *)tdb->map_ptr + key_ofs; + } else { + buf = tdb_alloc_read(tdb, key_ofs, full_len); + if (buf == NULL) { + goto fail; + } + key.dptr = buf; + } + data.dptr = key.dptr + key.dsize; + + ret = fn(tdb, key, data, private_data); + free(buf); + + count += 1; + + if (ret != 0) { + break; + } + } + + rec_ptr = rec.next; + + ok = tdb_chainwalk_check(tdb, &chainwalk, rec_ptr); + if (!ok) { + goto fail; + } + } + tdb->traverse_read -= 1; + tdb_unlock(tdb, chain, F_RDLCK); + return count; + +fail: + tdb->traverse_read -= 1; + tdb_unlock(tdb, chain, F_RDLCK); + return -1; +} + +_PUBLIC_ int tdb_traverse_key_chain(struct tdb_context *tdb, + TDB_DATA key, + tdb_traverse_func fn, + void *private_data) +{ + uint32_t hash, chain; + int ret; + + hash = tdb->hash_fn(&key); + chain = BUCKET(hash); + ret = tdb_traverse_chain(tdb, chain, fn, private_data); + + return ret; +} diff --git a/lib/tdb/include/tdb.h b/lib/tdb/include/tdb.h index 535f07a..445f48c 100644 --- a/lib/tdb/include/tdb.h +++ b/lib/tdb/include/tdb.h @@ -481,6 +481,73 @@ int tdb_traverse(struct tdb_context *tdb, tdb_traverse_func fn, void *private_da int tdb_traverse_read(struct tdb_context *tdb, tdb_traverse_func fn, void *private_data); /** + * @brief Traverse a single hash chain + * + * Traverse a single hash chain under a single lock operation. No + * database modification is possible in the callback. + * + * This exists for background cleanup of databases. In normal + * operations, traversing a complete database can be much too + * expensive. Databases can have many chains, which will all have to + * be looked at before tdb_traverse finishes. Also tdb_traverse does a + * lot of fcntl activity to protect against concurrent record deletes. + * + * With this you can walk a fraction of the whole tdb, collect the + * entries you want to prune, leave the traverse, and then modify or + * delete the records in a subsequent step. + * + * To walk the entire database, call this function tdb_hash_size() + * times, with 0<=chain<tdb_hash_size(tdb). + * + * @param[in] tdb The database to traverse. + * + * @param[in] chain The hash chain number to traverse. + * + * @param[in] fn The function to call on each entry. + * + * @param[in] private_data The private data which should be passed to the + * traversing function. + * + * @return The record count traversed, -1 on error. + */ + +int tdb_traverse_chain(struct tdb_context *tdb, + unsigned chain, + tdb_traverse_func fn, + void *private_data); + +/** + * @brief Traverse a single hash chain + * + * This is like tdb_traverse_chain(), but for all records that are in + * the same chain as the record corresponding to the key parameter. + * + * Use it for ongoing database maintenance under a lock. Whenever you + * are about to modify a database, you know which record you are going + * to write to. For that tdb_store(), an exclusive chainlock is taken + * behind the scenes. To utilize this exclusive lock for incremental + * database cleanup as well, tdb_chainlock() the key you are about to + * modify, then tdb_traverse_key_chain() to do the incremental + * maintenance, modify your record and tdb_chainunlock() the key + * again. + * + * @param[in] tdb The database to traverse. + * + * @param[in] key The record key to walk the chain for. + * + * @param[in] fn The function to call on each entry. + * + * @param[in] private_data The private data which should be passed to the + * traversing function. + * + * @return The record count traversed, -1 on error. + */ + +int tdb_traverse_key_chain(struct tdb_context *tdb, + TDB_DATA key, + tdb_traverse_func fn, + void *private_data); +/** * @brief Check if an entry in the database exists. * * @note 1 is returned if the key is found and 0 is returned if not found this diff --git a/lib/tdb/test/run-mutex-openflags2.c b/lib/tdb/test/run-mutex-openflags2.c index 6522ae4..89603e6 100644 --- a/lib/tdb/test/run-mutex-openflags2.c +++ b/lib/tdb/test/run-mutex-openflags2.c @@ -112,13 +112,6 @@ int main(int argc, char *argv[]) data.dsize = strlen("world"); data.dptr = discard_const_p(uint8_t, "world"); - tdb = tdb_open_ex("mutex-openflags2.tdb", 0, - TDB_INCOMPATIBLE_HASH| - TDB_MUTEX_LOCKING, - O_RDWR|O_CREAT, 0755, &nolog_ctx, NULL); - ok((tdb == NULL) && (errno == EINVAL), "TDB_MUTEX_LOCKING without " - "TDB_CLEAR_IF_FIRST should fail with EINVAL - %d", errno); - if (!runtime_support) { tdb = tdb_open_ex("mutex-openflags2.tdb", 0, TDB_CLEAR_IF_FIRST| diff --git a/lib/tdb/test/run-traverse-chain.c b/lib/tdb/test/run-traverse-chain.c new file mode 100644 index 0000000..2a25bec --- /dev/null +++ b/lib/tdb/test/run-traverse-chain.c @@ -0,0 +1,94 @@ +#include "../common/tdb_private.h" +#include "../common/io.c" +#include "../common/tdb.c" +#include "../common/lock.c" +#include "../common/freelist.c" +#include "../common/traverse.c" +#include "../common/transaction.c" +#include "../common/error.c" +#include "../common/open.c" +#include "../common/check.c" +#include "../common/hash.c" +#include "../common/mutex.c" +#include "tap-interface.h" +#include <stdlib.h> +#include "logging.h" + +static char keystr0[] = "x"; +static TDB_DATA key0 = { .dptr = (uint8_t *)keystr0, + .dsize = sizeof(keystr0) }; +static char valuestr0[] = "y"; +static TDB_DATA value0 = { .dptr = (uint8_t *)valuestr0, + .dsize = sizeof(valuestr0) }; + +static char keystr1[] = "aaa"; +static TDB_DATA key1 = { .dptr = (uint8_t *)keystr1, + .dsize = sizeof(keystr1) }; +static char valuestr1[] = "bbbbb"; +static TDB_DATA value1 = { .dptr = (uint8_t *)valuestr1, + .dsize = sizeof(valuestr1) }; + +static TDB_DATA *keys[] = { &key0, &key1 }; +static TDB_DATA *values[] = { &value0, &value1 }; + +static bool tdb_data_same(TDB_DATA d1, TDB_DATA d2) +{ + if (d1.dsize != d2.dsize) { + return false; + } + return (memcmp(d1.dptr, d2.dptr, d1.dsize) == 0); +} + +struct traverse_chain_state { + size_t idx; + bool ok; +}; + +static int traverse_chain_fn(struct tdb_context *tdb, + TDB_DATA key, + TDB_DATA data, + void *private_data) +{ + struct traverse_chain_state *state = private_data; + + state->ok &= tdb_data_same(key, *keys[state->idx]); + state->ok &= tdb_data_same(data, *values[state->idx]); + state->idx += 1; + + return 0; +} + +int main(int argc, char *argv[]) +{ + struct tdb_context *tdb; + struct traverse_chain_state state = { .ok = true }; + int ret; + + plan_tests(4); + + tdb = tdb_open_ex( + "traverse_chain.tdb", + 1, + TDB_CLEAR_IF_FIRST, + O_RDWR|O_CREAT, + 0600, + &taplogctx, + NULL); + ok1(tdb); + + /* add in reverse order, tdb_store adds to the front of the list */ + ret = tdb_store(tdb, key1, value1, TDB_INSERT); + ok1(ret == 0); + ret = tdb_store(tdb, key0, value0, TDB_INSERT); + ok1(ret == 0); + + ret = tdb_traverse_key_chain(tdb, key0, traverse_chain_fn, &state); + ok1(ret == 2); + ok1(state.ok); + + unlink(tdb_name(tdb)); + + tdb_close(tdb); + + return exit_status(); +} diff --git a/lib/tdb/wscript b/lib/tdb/wscript index 3ab21fc..6a6adab 100644 --- a/lib/tdb/wscript +++ b/lib/tdb/wscript @@ -1,7 +1,7 @@ #!/usr/bin/env python APPNAME = 'tdb' -VERSION = '1.3.16' +VERSION = '1.3.17' -- Samba Shared Repository