Ganeti is currently not able to detect a legitimate shutdown request performed
by a user from inside a Xen domain.

This patch adds a design document describing a mechanism able to cope with
such events.

Signed-off-by: Michele Tartara <[email protected]>
---
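Note (not part of the commit): a minimal sketch of the check the design
document below describes, assuming ``xm list``-style output with the State
flags in the second-to-last column. The helper names (``parse_list_output``,
``classify``, ``plan_actions``) and the ``admin_up`` set are hypothetical and
are not existing Ganeti APIs; the sketch only decides what to do and does not
invoke Xen or change any Ganeti state.

```python
from typing import Dict, Iterable, List, Tuple


def parse_list_output(text: str) -> Dict[str, str]:
    """Map domain name -> State flags from `xm list`-style output."""
    states = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue  # not a domain row
        # Name is the first column; State is the one before Time(s).
        states[fields[0]] = fields[-2]
    return states


def classify(state: str) -> str:
    """Reduce the six-character State flags to the cases of interest."""
    if "c" in state:
        return "crashed"
    if "s" in state:
        return "shutdown"
    return "other"


def plan_actions(states: Dict[str, str],
                 admin_up: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (instances to mark ADMIN_down, domains to destroy).

    `admin_up` holds the instances whose desired status is still up
    (an assumption about how the caller tracks that status).
    """
    up = set(admin_up)
    mark_down = sorted(name for name, st in states.items()
                       if classify(st) == "shutdown" and name in up)
    destroy = sorted(name for name, st in states.items()
                     if classify(st) in ("crashed", "shutdown"))
    return mark_down, destroy
```

A crashed domain is only scheduled for destruction, so the watcher would
still restart it; a cleanly shut down one is also marked as down.
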
 Makefile.am                      |  1 +
 doc/design-draft.rst             |  1 +
 doc/design-internal-shutdown.rst | 72 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 74 insertions(+)
 create mode 100644 doc/design-internal-shutdown.rst

diff --git a/Makefile.am b/Makefile.am
index 037cf53..f66624e 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -410,6 +410,7 @@ docinput = \
        doc/design-htools-2.3.rst \
        doc/design-http-server.rst \
        doc/design-impexp2.rst \
+       doc/design-internal-shutdown.rst \
        doc/design-lu-generated-jobs.rst \
        doc/design-linuxha.rst \
        doc/design-multi-reloc.rst \
diff --git a/doc/design-draft.rst b/doc/design-draft.rst
index ccb2f93..9a1d2b1 100644
--- a/doc/design-draft.rst
+++ b/doc/design-draft.rst
@@ -19,6 +19,7 @@ Design document drafts
    design-storagetypes.rst
    design-reason-trail.rst
    design-device-uuid-name.rst
+   design-internal-shutdown.rst
 
 .. vim: set textwidth=72 :
 .. Local Variables:
diff --git a/doc/design-internal-shutdown.rst b/doc/design-internal-shutdown.rst
new file mode 100644
index 0000000..836d00c
--- /dev/null
+++ b/doc/design-internal-shutdown.rst
@@ -0,0 +1,72 @@
+============================================================
+Detection of user-initiated shutdown from inside an instance
+============================================================
+
+.. contents:: :depth: 2
+
+This is a design document detailing the implementation of a way for Ganeti to
+detect whether a machine marked as up but not running was shut down gracefully
+by the user from inside the machine itself.
+
+Current state and shortcomings
+==============================
+
+Ganeti keeps track of the desired status of instances in order to be able to
+take proper actions (e.g. reboot) on the ones that happen to crash.
+Currently, the only way to properly shut down a machine is through Ganeti's own
+commands, which mark the instance as ``ADMIN_down``.
+If a user shuts down an instance from inside, through the proper command of the
+operating system it is running, the instance is shut down gracefully, but
+Ganeti is not aware of that: the desired status of the instance will still be
+marked as ``running``, so when the watcher realises that the instance is down,
+it will restart it. This behaviour is usually not what the user expects.
+
+Proposed changes
+================
+
+We propose to modify Ganeti in such a way that it will detect when an instance
+was shut down because of an explicit user request. When such a situation is
+detected, the state of the instance will be set to ``ADMIN_down``, as intended
+by the user.
+
+This design document applies to the Xen backend of Ganeti, because it uses
+features specific to that hypervisor.
+
+Implementation
+==============
+
+Xen knows why a domain is being shut down (a crash or an explicit shutdown
+or poweroff request), but such information is not usually readily available
+externally, because all such cases lead to the virtual machine being destroyed
+immediately after the event is detected.
+
+Still, Xen allows the instance configuration file to define what action is
+taken in all those cases through the ``on_poweroff``, ``on_shutdown`` and
+``on_crash`` variables. When they are set to ``preserve``, Xen does not
+destroy the domains automatically.
+
+When the domain is not destroyed, it can be viewed by using ``xm list`` (or ``xl
+list`` in newer Xen versions), and the ``State`` field of the output will
+provide useful information.
+
+If the state is ``----c-`` it means the instance has crashed.
+
+If the state is ``---s--`` it means the instance was properly shut down.
+
+If the instance was properly shut down and it is still marked as ``running`` by
+Ganeti, it means that it was shut down from inside by the user, and the Ganeti
+status of the instance needs to be changed to ``ADMIN_down``.
+
+This will be done at regular intervals by the group watcher, just before
+deciding which instances to reboot.
+
+In addition, at the same time, the watcher will also need to issue ``xm
+destroy`` commands for all the domains that are in crashed or shutdown state,
+since this will no longer be done automatically by Xen because of the
+``preserve`` setting in their config files.
+
+.. vim: set textwidth=72 :
+.. Local Variables:
+.. mode: rst
+.. fill-column: 72
+.. End:
-- 
1.8.2.1
