The annotated tag, ctdb-1.2.27-204.1 has been created
        at  895a0ddae46597a2bb369739a6e453bdb08ce3cf (tag)
   tagging  66a469daa2552f93219c2fc34aa8659f4b0c76d8 (commit)
  replaces  ctdb-1.9.1
 tagged by  Martin Schwenke
        on  Thu Aug 16 12:32:52 2012 +1000

- Log -----------------------------------------------------------------
Version 1.2.27-204.1

Andrew Tridgell (1):
      tdb: added TDB_NO_FSYNC env variable

Chandra Seetharaman (1):
      make changes to ctdb event scripts to support NFS-Ganesha.

Christian Ambach (1):
      improve timing issue detections

Evan Kinney (1):
      ctdb: Fixed use of reserved word "private" in typedefs

Günther Deschner (1):
      lib/tdb: fix c++ build warning in tdb_header_hash().

Harald Klatte (1):
      AIX bind wants the correct addrsize

Jelmer Vernooij (3):
      pytdb: Make filename argument optional.
      pytdb: Include Python.h first to prevent warning.
      pytdb: Add __version__ attribute.

Kirill Smelkov (9):
      pytdb: Add support for tdb_add_flags() & tdb_remove_flags()
      pytdb: Fix repr segfault for internal db
      pytdb: Update open flags to match those for tdb_open() in tdb.h
      pytdb: Add support for tdb_enable_seqnum, tdb_get_seqnum and tdb_increment_seqnum_nonblock
      pytdb: Add support for tdb_transaction_prepare_commit()
      pytdb: Add support for tdb_freelist_size()
      pytdb: Add TDB_INCOMPATIBLE_HASH open flag
      pytdb: Add support for tdb_repack()
      pytdb: Check errors after PyObject_New() calls

Martin Schwenke (69):
      Test suite: handle change to disconnected node error message.
      Test suite: handle extra lines in statistics output.
      Optimise 61.nfstickle to write the tickles more efficiently.
      Testing: Add Python IP allocation simulation.
      Test suite: handle change to disconnected node error message.
      Test suite: handle extra lines in statistics output.
      Optimise 61.nfstickle to write the tickles more efficiently.
      Testing: Add Python IP allocation simulation.
      Merge branch 'master' of git://git.samba.org/sahlberg/ctdb
      Testing: Add imbalance information to IP allocation simulation.
      Testing: In IP allocation simulation count total number of events.
      Testing: IP allocation simulation prints final imbalance in statistics.
      Testing: IP allocation simulation - save some warnings for verbose mode.
      Testing: IP allocation simulation - add command line option for random seed.
      Testing: IP allocation simulation - update copyright message.
      Testing: IP allocation simulation - Tweak options handling and Cluster.diff().
      Testing: IP allocation simulation - fix nondeterminism in do_something_random().
      Testing: IP allocation simulation - Update README.
      Testing: IP allocation simulation - update options processing in examples.
      Testing: IP allocation simulation - add general node group example.
      Testing: IP allocation simulation - rename an example to node_group_simple.py.
      Testing: IP allocation simulation - rename an example to node_group_extra.py.
      Testing: IP allocation simulation - make usage/failure more obvious.
      Testing: IP allocation simulation - improve help for options.
      Testing: IP allocation simulation - print maximum number of unhealthy nodes.
      Testing: IP allocation simulation - clean up usage message.
      Testing: IP allocation simulation - add option to change odds of a failure.
      Test suite - try to make addip test more reliable and add some debugging.
      Merge remote branch 'martins/master'
      Test suite - fix addip test.
      Test suite: remove thaw/freeze tests.
      Test suite - make the ctdb_fetch test cope with "Reqid wrap!" messages.
      initscript: wait until we can ping ctdbd before setting tunables.
      Test suite: weaken ctdb continue/enable tests for non-deterministic IPs.
      Test suite: Fix typo in continue test.
      Test suite: remove unnecessary verbosity from enable/continue tests.
      Add some command-line options to ctdb_diagnostics.
      Test suite: make addip test use $CTDB rather than ctdb in debug code.
      Test suite: improve wait_until_node_has_status()
      Test suite: use $CTDB rather than ctdb everywhere in ctdb_test_functions.sh.
      Test suite: strengthen function _cluster_is_healthy().
      Test suite: print date/time at test completion.
      Test suite: Add more timestamping of debugging information.
      Test suite: loosen the getmonmode test.
      Move NAT gateway firewall rules to recovered|updatenatgw events.
      Merge branch 'master' of git://git.samba.org/sahlberg/ctdb
      Merge branch 'master' of git://git.samba.org/sahlberg/ctdb
      Test suite: in the test eventscript, run "ctdb" not "$CTDB".
      NFS tickles: use addtickle/deltickle instead of shared tickle directory.
      Test suite: NFS tickle test uses gettickles if events.d/61.nfstickle missing.
      Test suite: Fix typos in NFS tickle test.
      Test suite: Tweak NFS tickle test.
      Test suite: Fix NFS tickle test.
      Test suite: Make NFS tickle test more flexible.
      Test suite: make statistics test cope with changes to statistics output.
      Test suite: match changed output for ctdb ping to disconnected node.
      Test suite: fix typo in ctdb ping test grep pattern.
      60.nfs only fails or warns after 10 consecutive nfsd/statd failures.
      Make a time comparison in 60.nfs eventscript more readable.
      Eventscripts: make loadconfig() function hookable by the test suite.
      50.samba eventscript should stop/start services when they become (un)managed.
      Eventscript functions - catch failures in ctdb_service_start().
      60.nfs eventscript should do nothing if NFS isn't managed by CTDB.
      Eventscripts: work around NFS restart failure under load.
      Eventscripts: print a message when reconfiguring a service.
      Eventscripts: only autostart during a monitor event.
      Eventscripts: use "startstop_nfs restart" to reconfigure NFS.
      Eventscripts: lower the fail/restart limits for nfsd.
      New version 1.2.27-204.1

Michael Adam (79):
      server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag
      persistent_callback: ignore the update-recordreturn code of remote node in recovery
      persistent_store_timout: do not really time out the trans3_commit control in recovery
      persistent: if a node failed to update_record, trigger a recovery
      persistent: reduce indentation for the finishing moves in ctdb_persistent_callback
      persistent_callback: print "no error message given" instead of "(null)"
      persistent: add a ctdb_persistent_state member to the ctdb_db context.
      persistent: add a ctdb_db context to the ctdb_persistent_state struct.
      persistent: allocate the persistent state in the ctdb_db struct in trans3_commit
      persistent: reject trans3_control when a commit is already active.
      persistent: add a client context to the persistent_stat and track the db_id
      daemon: correctly end a running trans3_commit if the client disconnects.
      persistent: add ctdb_persistent_finish_trans3_commits().
      recover: finish pending trans3 commits when a recovery is finished.
      New version 1.2.22.
      gitignore: add vi swap files
      gitignore: add tags file
      tests: fix segfault in randrec test when connection to daemon fails.
      tests: fix segfault in fetch test when connection to ctdb failed.
      tests: fix segfault in fetch_one test when connection to ctdbd fails
      tests: fix segfault in store test when connection to ctdbd failed.
      Fix typos in a comment in vacuum_traverse.
      vacuum: in ctdb_vacuum_db, fix the length of the array of vacuum fetch lists
      vacuum: correctly send TRY_DELETE_RECORDS ctrl to all active nodes
      vacuum: reduce indentation of the loop sending VACUUM_FETCH controls
      vacuum: check lmaster against num_nodes instead of vnn_map->size
      server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag
      recoverd: in a recovery, set the MIGRATED_WITH_DATA flag on all records
      call: add new call flag CTDB_CALL_FLAG_VACUUM_MIGRATION
      call: Move definition of call flags down to the definition of the flags field.
      add a new record flag CTDB_REC_FLAG_VACUUM_MIGRATED.
      server: in the VACUUM_FETCH handler, add the VACUUM_MIGRAION to the call flags
      call: transfer the record flags in the ctdb call packets.
      call: hand the submitted record_flags to local record storage function.
      call: becoming dmaster in VACUUM_MIGRATION, set the VACUUM_MIGRATED record flag
      Add a delete_queue to the ctdb database context struct.
      When attaching to a non-persistent DB, initialize the delete_queue.
      vaccum: clear the fast-path vacuuming delete_queue after creating the vacuuming child.
      When wiping a database, clear the delete_queue.
      server: rename ctdb_repack_db() to ctdb_vacuum_and_repack_db()
      vacuum: refactor new add_record_to_vacuum_fetch_list() out of vacuum_traverse().
      vacuum: skip adding records to list of records to send to lmaster on lmaster
      vacuum: refactor new add_record_to_delete_tree() out of vacuum_traverse().
      vacuum: reduce indentation in add_record_to_delete_tree()
      vacuum: add delete_queue_traverse() for traversal of the delete_queue.
      vacuum: traverse the delete_queue befor traversing the database.
      Add a tunable VacuumFastPathCount.
      vacuum: add a fast_path_count to the vacuum_handle.
      vacuum: bump the number of fast-path runs in the vacuum child destructor
      vacuum: reset the fast path count in the event handle if it exceeds the limit.
      vacuum: Only run full vacuumig (db traverse) every VacuumFastPathCount times.
      vacuum: disable full db-traverse vacuuming runs when VacuumFastPathCount == 0
      vacuum: change all Vacuum*Interval tunables to default to 10
      vacuum: refactor insert_delete_record_data_into_tree() out of add_record_to_delete_tree()
      vacuum: add statistics output to the fast and full traverse runs.
      vacuum: lower level of hash collision debug message to INFO
      control: add macro CHECK_CONTROL_MIN_DATA_SIZE.
      control: add a new control opcode CTDB_CONTROL_SCHEDULE_FOR_DELETION
      server: implement a new control SCHEDULE_FOR_DELETION to fill the delete_queue.
      vacuum: add ctdb_local_schedule_for_deletion()
      client: add accessor function ctdb_header_from_record_handle().
      test: send SCHEDULE_FOR_DELETION control from randrec test.
      daemon: fill ctdb->ctdbd_pid early
      server: create a server variant ctdb_ltdb_store_server() of ctdb_ltdb_store().
      server: Use the ctdb_ltdb_store_server() in the ctdb daemon for non-persistent dbs
      ctdb_ltdb_store_server: delete an empty record that is safe to delete instead of storing locally.
      ctdb_ltdb_store_server: implement fastpath vacuuming deletion based on VACUUM_MIGRATED flag.
      ctdb_ltdb_store_server: always store the data when ctdb_ltdb_store() is called from the client
      ctdb_ltdb_store_server: Improve debug message in ctdb_ltdb_store when store or delete fails.
      ctdb_ltdb_store_server: add ability to send SCHEDULE_FOR_DELETION control to ctdb_ltdb_store.
      ctdb_private.h: add record flag CTDB_REC_FLAG_AUTOMATIC
      ltdb: add the CTDB_REC_FLAG_AUTOMATIC to the initial header in ctdb_ltdb_fetch()
      ctdb_ltdb_store_server: honour the AUTOMATIC record flag
      server: add a comment explaining the call redirect logic in ctdb_call_send_redirect().
      vacuum: raise a debug level from INFO to DEBUG
      vacuum: refactor insert_record_into_delete_queue out of ctdb_control_schedule_for_deletion
      vacuum: use insert_record_into_delete_queue in ctdb_local_schedule_for_deletion.
      vacuum: fix a comment typo
      New version 1.2.24.

Ronnie Sahlberg (209): Merge commit 'rusty/master' Add a code-style document. remove the "ctdb freeze" debugging command iupdate the docs that ctdb freeze is no more Merge remote branch 'martins/master' Update a log message to reflect that this does no longer only happen Create a new command "ctdb sync" that isd just an alias for "ctdb ipreallocate" Merge commit 'rusty/libctdb-new' into foo We use eventloop nesting in a couple of places, notably the sync update the example for the new signature of Add a new "ctdb addtickle" command to manually add tickles to ctdbd Remove the structure ctdb_control_tcp_vnn since this is identical to the structure ctdb_tcp_connection. Add machinereadable output for the "ctgdb gettickles <ip>" command On RHEL, "service nfs stop;service nfs start" and "service nfs restart" Merge commit 'rusty/vacuum-fix-master' Merge commit 'rusty/ports-from-1.0.112' into foo We need the deprecated talloc_append_string() for now Dont use the deprecated talloc_append_string() make it possible to "ctdb gettickle" to only list tickles for a certain add a new commandline flag -v to enable verbose output ctdb ip is very busy. Revert "tools/ctdb: add PartiallyOnline state for "ctdb status" and "ctdb status -Y"" Revert "version: generate RPM version from git" initial release for the 1.2 branch Dont set next_interval to 0. Dont set next_interval to 0. 
move the directives to build the devel file to the end of the specfile move the directives to build the devel file to the end of the specfile bump to -2 after fixing the specfile bug with wrong dependencies Add a command "ctdb pfetch <db> <record>" to read a record from get rid of two compiler warnings add a command to write a record to a persistent database Add a command "ctdb pfetch <db> <record>" to read a record from get rid of two compiler warnings add a command to write a record to a persistent database change "ctdb pfetch" to take an optional third argument change "ctdb pfetch" to take an optional third argument run the "init" event before we freeze the databases run the "init" event before we freeze the databases When "ctdb pfetch" creates a new file, make sure we set some initial sane mode bits When "ctdb pfetch" creates a new file, make sure we set some initial sane mode bits Dont initialize the domain socket for commands that do not require/use add a new command "ctdb tfetch" that can read a record straight out of the add a new command "ctdb tfetch" that can read a record straight out of the the tfetch command can be used without the daemon running, so flag it as such. the tfetch command can be used without the daemon running, so flag it as such. Add a configuration database, implemented as a persistent database. Add a configuration database, implemented as a persistent database. 
Add a new event "ipreallocated" Remove the dependency on the underlying cluster filesystem for handling remove the mention of a tickle and statd directory in shared storage now that we are removing these and migrating to store the data inside ctdbd or persistent databases Merge commit 'martins/master' into 1.2 we no longer have a 61.nfstickle script remove 61.nfstickles from the makefile new version 1.2.3 ouch, the ordering of the constants and the strings must be kept in sync ouch, remove a dummy debug printout that snuck in there somehow new version 1.2.4 dont print a lot of log information about shutting down vsftpd make sure all statd state directories exist before we try to reference them When memory allocations for recovery fails, Dont store temporary runtime data in $CTDB_BASE/state Change how NATGW is configured to allow special nodes that do not have Dont try to read the nodemap from the daemon for "ctdb listnodes" new version 1.2.5 remove an unused variable Implement a new function GETNODEMAP in libctdb. Add two new server types to the server_id structure. define and reserve a range of ctdb message ports for use by nfs and iscsi servers Update the comment for the range reserved for SAMBA and add a new serverid to send a message everytime an ip address is taken on the local node adda GETPUBLICIPS control to libctdb and use this in the test example set up a handler to catch and log debug messages from the tevent layer update/improve the log message related to rerecovery timeouts Add back monitoring for time skips, forward as well as backward. Create macros to update the statistics counters and use these macros Add a new statistics structure to keep the current running statistics Add rolling statistics that are collected across 10 second intervals. 
Create a tunable for how often to collect rolling statistics and initialize it to 1 second add a machinereadable version of ctdb stats/statistics when printing machinereadable statistics only print the header with the fieldnames once Dont log a normal vacuuming message about a missing record and using default vacuuming intervals as an error. get rid of the "ctdb setflags" command since change the hash function to use the much better Jenkins hash Spotted by rusty. dont stop checking interfaces after the first bond device Update the default hash size to be 100001 instead of 10000 Update latency countes to show min/max and average move extracting the config from config.tdb for public addresses New version 1.2.6 Make sure the statd directory exist before trying to access the Remove a debug message "Timed out waiting ..." try to restart NFS LOCKD if it failed to start If tdb_open() fails when trying to open the vacuuming database, remove checking for filesystems and filesystem health from the cnfs script. New version 1.2.7 Add support to create TDB databases using the new jenkins hash. new version 1.2.8 Drop the loglevel of the "reqid wrap" developer debug message to DEBUG Redirect the output from 00.ctdb pfetch to stdout. When shuttind down, we always unconditionally try to remove the natgw address during shutdown there is a window after we have stopped TCP and disconnected from all other nodes but before we have stopped all processing. Both nfs and nfslock scripts can fail under redhat in very rare situations. 
New version 1.2.9 when creating/adding a public ip, set the initial interface to be the first interface specified dont delete all ips from the system during the initial "init" event change the default for how long to waqit before dropping all ips to 120 seconds Add a new tunable : DisableIPFailover that when set to non 0 dont check the public ip assignment or if even we are hosting them and shouldnt when we load the public address file, at the same time check if we are already hosting the public address, if so, set ourselves up as the pnn for that address delay loading the public ip address file until after we have started the transport and discovered ouw own pnn number delete from old interface before adding to new interface this stuff is just so fragile that it will enter infinite recovery and fail loops Dont check remote ip allocation if public ip mgmt is disabled change the takeover script timeout to 9 seconds from 5 Dont exit the update ip function if the old and new interfaces are the same initialize the statistics to the current time, not start of epoch New version 1.2.10 add a new support function ctdb_check_counter_equal() Dont pollute the logs with a "file not found" message add an explicit _is_managed_service to iscsi eventscript update autostart/stop to work for samba new version 1.2.11 When we are no longer the natgw master, dont put the natgw ip on loopback. new version 1.2.12 dont try starting samba through the "init" event new version 1.2.13 during ip allocation, there are failure modes where a node might hold a ip address add a missing part of the import of the previous ganesha patch Add 60.ganesha to what gets installed by make install as well as by the RPM Remove LACOUNT and LACCESSOR and migrate the records immediately. change one of the reserved words in the ctdb ltdb header to be a flags field Add two new flags for the ltdb header. 
new version 1.2.14 add a new ctdb_ltdb function to delete a record in a normal database add new command line functions Add a new header flag for "migrated with data" and set this to 1 only run "serverid wipe" if we are actually running samba. LibCTDB libctdb When assigning the single-public-ip during startup, LVS ctdb addip: Revert "Add a new header flag for "migrated with data" and set this to 1" Revert "server: when we migrate off a record with data, set the MIGRATED_WITH_DATA flag" New version 1.2.15 50.samba Dont run net serverid wipe in the background 60.nfs 41.HTTPD New version 1.2.16 60.nfs Check if we have rpc.statd and if not, skip checking for statd Revert scheduling back to use real-time processes Add ctdb_fork(0 which will fork a child process and drop the real-time recoverd: avoid triggering a full recovery if just some ip allocation LIBCTDB uninitialized inqueue element STATD is 100027 not 1000247 TYPO IPALLOCATION : If the node is held pinned down in "init" state ADDIP failure We can not always rely on the recovery daemon pinging us in a timely manner LIBCTDB: add support for traverse change Christinas previous patch to only perform the check/logging New version 1.2.17 ctdb: hold transaction locks during freeze, mark during recover. TDB : Fix for a deadlock with transaction lock and lockall/lockallmark 60.nfs Add a new test tool that fetch locks a record and then blocks until it receives Compile fix LockWait congestion. If the node is stopped, put a log entry in /var/log/* to indicate this is why we never become ready New version 1.2.19 We default to non-deterministic ip now where ips are "sticky" and dont change New version 1.2.20 Dont allow client processes to attach to databases while we are still in recovery mode. Revert "Dont allow client processes to attach to databases while we are still in recovery mode." 
ctdb_req_dmaster from non-master New version 1.2.21 Deferred attach : at early startup, defer any db attach calls until we are out of recovery. ATTACH_DB: simplify the code slightly and change the semantics to only new version 1.2.23 If/when the recovery daemon terminates unexpectedly, try to restart it again from the main daemon instead of just shutting down the main deamon too. Restart recovery dameon if it looks like it hung. Dont allow clients to connect to databases untile we are well past and through Vacuuming: initialize a variable to avoid a harmless valgrind hit IP reallocation. If a public address is already hosted on the node when we startup, log a warning message but do not cause the recovery to fail. New version 1.2.25 Deferred attach: create the timed event as a child context of the da context we want to delete. new version 1.2.26 New version 1.2.27 This needs more testing first IFACE handling. Assume links are always good on nstartup (they almost always New version 1.2.27-2 bonding mode 4 monitoring: Dont exit from checking interfaces once we have found one interface that is not If samba fails to start for some reason, make this cause the startup event to fail too, so that ctdbd will re-try the startup event later. New version 1.2.27-3 renme to version 1.2.27-200 to leve some spce so we dont collide with Verify that state is not NULL before we dereference it in When using multiple VLANs, some funky stuff can sometimes happen when Remove all checking of GPFS from ctdb_diagnostics new version 1.2.27-201 Remove a benign by annoying log message that will be logged after an interface that has been in use has later been removed and is no longer referenced by any public addresses. New version 1.2.27-202 Remove logging of spam/errors from the 10.interfrace New version 1.2.27-203 Update the delip command new version 1.2.27-204 Dont call the UPDATE event if both old and new interface is the same. 

Rusty Russell (46):
      libctdb: removed unused lock field from struct ctdb_db
      libctdb: fix uninitialized field usage on ctdb_attach failure path
      libctdb: synchronous should be using ctdb_cancel to kill unfinished requests.
      libctdb: check ctdb_request_free & ctdb_cancel used appropriately.
      libctdb: ctdb_service() never returns < 0
      libctdb: fix writerecord() to actually write the record.
      libctdb: fix io_elem resource leak on realloc failure.
      libctdb: implement ctdb_disconnect and ctdb_detachdb
      libctdb: implement synchronous readrecordlock interface.
      libctdb: test infrastructure
      libctdb: test: logging enhancement
      libctdb: test: improve logging of failure paths
      libctdb: test: --no-failtest
      libctdb: test: add database save and restore
      libctdb: test: add readrecordlock support
      libctdb: test: run.sh script
      ctdb: fix crash on "ctdb scriptstatus --events=releaseip"
      config: wrap iptables in flock to avoid concurrancy.
      libctdb: add synchronous message handling and unregister, with tests.
      tdb: fix short write logic in tdb_new_database
      tdb: remove unused variable in tdb_new_database().
      tdb: Fix tdb_check() to work with read-only tdb databases.
      tdb: workaround starvation problem in locking entire database.
      talloc: update to 2.0.3 version from SAMBA
      event: Update events to latest Samba version 0.9.8
      freeze: abort vacuuming when we're going to freeze.
      vacuum: fix crash on vacuum abort
      vacuum: disabling vacuuming during a freeze
      takeover: prevent crash by avoiding free in traverse on RST timeout
      logging: give a unique logging name to each forked child.
      idtree: fix right shift of signed ints, crash on large ids on AIX
      tdb: make check more robust against recovery failures.
      tdb: fix tdb_check() on read-only TDBs to actually work.
      tdb: fix tdb_check() on other-endian tdbs.
      tdb: put example hashes into header, so we notice incorrect hash_fn.
      tdb: increment version to 1.2.4
      tdb: add Bob Jenkins lookup3 hash as helper hash.
      tdb: automatically identify Jenkins hash tdbs
      tdb: TDB_INCOMPATIBLE_HASH, to allow safe changing of default hash.
      tdb: fix non-WAF build, commit 1.2.6 ABI file.
      idtree: fix overflow for v. large ids on allocation and removal
      tdb: expose transaction lock infrastructure for ctdb
      ctdb_lockwait: create overflow queue.
      ctdbd: fix lock held on error ("ctdb_req_dmaster from non-master.")
      ctdbd: call tdb_reopen_all() in freeze child.
      eventscript: fix callback after free

Stefan Metzmacher (3):
      config/interface_modify.sh: before calling a script check if it exists and is executable
      config/interface_modify.sh: do the echo before running the script
      events/10.interface: we need to mark interfaces as "up" if we don't know how to monitor them

Volker Lendecke (2):
      Correctly set docdir
      tdb: add restore

-----------------------------------------------------------------------

--
CTDB repository