With jobs as processes, running jobs can interfere with cluster destruction: between the time the LUClusterDestroy job finishes and the time 'gnt-cluster destroy' actually cleans up the cluster, they can start running again. To avoid this situation, we make WConfD take over the BGL so that parallel jobs do not do anything after the LUClusterDestroy job has finished; they will clean up themselves once their job file is deleted.
To make things worse, the watcher may restart any stopped daemons. So, to avoid wconfd starting again after it was terminated by 'gnt-cluster destroy', we also put the cluster into a no-master state; this way, wconfd will not win the voting and hence will not be restarted.

Klaus Aehlig (10):
  Make LuxiD clean up its lock file
  Make job processes keep track of their job id
  Keep track of the number LUs executing
  Detect if the own job file disappears
  Add a prefix for a WConfD livelock
  Make WConfD have a livelock file as well
  WConfD: do not clean up own livelock
  Support no-master state in ssconf
  Add an RPC to prepare cluster destruction
  Make LUClusterDestroy tell WConfD

 lib/cmdlib/cluster.py                     |  2 ++
 lib/jqueue/__init__.py                    | 17 +++++++++++
 lib/mcpu.py                               |  3 ++
 src/Ganeti/Constants.hs                   |  4 +++
 src/Ganeti/Query/Server.hs                |  2 +-
 src/Ganeti/WConfd/Core.hs                 | 47 +++++++++++++++++++++++++++++--
 src/Ganeti/WConfd/DeathDetection.hs       |  9 ++++--
 src/Ganeti/WConfd/Monad.hs                |  8 +++++-
 src/Ganeti/WConfd/Server.hs               |  5 +++-
 src/Ganeti/WConfd/Ssconf.hs               |  2 +-
 test/py/cmdlib/testsupport/wconfd_mock.py |  3 ++
 11 files changed, 94 insertions(+), 8 deletions(-)

--
2.2.0.rc0.207.ga3a616c
