[PATCH stable-2.12 00/10] Avoid races in cluster destruction

'Klaus Aehlig' via ganeti-devel Mon, 23 Mar 2015 08:47:18 -0700

With jobs as processes, running jobs can interfere with cluster
destruction: between the time the LUClusterDestroy job finishes
and the time 'gnt-cluster destroy' actually cleans up the cluster,
they can start running again. To avoid this situation, we make
WConfD take over the BGL so that parallel jobs do not do anything
after the LUClusterDestroy job has finished; they will clean up
themselves once their job file is deleted.
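
As a rough illustration of the "clean up once the job file is deleted"
idea, the following is a minimal sketch and not the code from this
series. It assumes a job process can test for its own queue file on
disk; the names job_file_exists and run_steps are hypothetical.

```python
import os


def job_file_exists(path):
    """Return True while the job's queue file is still present on disk."""
    return os.path.exists(path)


def run_steps(job_file, steps):
    """Run the given callables in order, stopping once the job file is gone.

    Hypothetical helper: the existence check happens before each step,
    so a step that is already running is not interrupted.
    Returns the number of steps that were executed.
    """
    executed = 0
    for step in steps:
        if not job_file_exists(job_file):
            # The job file was removed (e.g. by cluster destruction);
            # stop doing work and let the caller clean up.
            break
        step()
        executed += 1
    return executed
```

The check is done between steps only, so in this sketch a job notices
the missing file at the next step boundary rather than immediately.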


To make things worse, the watcher may restart any stopped daemons.
So, to avoid WConfD starting again after it was terminated by
'gnt-cluster destroy', we also put the cluster into a no-master
state; this way, WConfD will not win the voting and hence will
not be restarted.

Klaus Aehlig (10):
  Make LuxiD clean up its lock file
  Make job processes keep track of their job id
  Keep track of the number LUs executing
  Detect if the own job file disappears
  Add a prefix for a WConfD livelock
  Make WConfD have a livelock file as well
  WConfD: do not clean up own livelock
  Support no-master state in ssconf
  Add an RPC to prepare cluster destruction
  Make LUClusterDestroy tell WConfD

 lib/cmdlib/cluster.py                     |  2 ++
 lib/jqueue/__init__.py                    | 17 +++++++++++
 lib/mcpu.py                               |  3 ++
 src/Ganeti/Constants.hs                   |  4 +++
 src/Ganeti/Query/Server.hs                |  2 +-
 src/Ganeti/WConfd/Core.hs                 | 47 +++++++++++++++++++++++++++++--
 src/Ganeti/WConfd/DeathDetection.hs       |  9 ++++--
 src/Ganeti/WConfd/Monad.hs                |  8 +++++-
 src/Ganeti/WConfd/Server.hs               |  5 +++-
 src/Ganeti/WConfd/Ssconf.hs               |  2 +-
 test/py/cmdlib/testsupport/wconfd_mock.py |  3 ++
 11 files changed, 94 insertions(+), 8 deletions(-)

-- 
2.2.0.rc0.207.ga3a616c
