On 17 May 2013 11:46, Michele Tartara <[email protected]> wrote:
> Ganeti is currently not able to detect a legit shutdown request performed by a
> user from inside a Xen domain.
>
> This patch provides a design document to implement a mechanism able to cope
> with such events.
>
> Signed-off-by: Michele Tartara <[email protected]>
> ---
>  Makefile.am                      |  1 +
>  doc/design-draft.rst             |  1 +
>  doc/design-internal-shutdown.rst | 72 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 74 insertions(+)
>  create mode 100644 doc/design-internal-shutdown.rst
>
> diff --git a/Makefile.am b/Makefile.am
> index 037cf53..f66624e 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -410,6 +410,7 @@ docinput = \
>         doc/design-htools-2.3.rst \
>         doc/design-http-server.rst \
>         doc/design-impexp2.rst \
> +       doc/design-internal-shutdown.rst \
>         doc/design-lu-generated-jobs.rst \
>         doc/design-linuxha.rst \
>         doc/design-multi-reloc.rst \
> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
> index ccb2f93..9a1d2b1 100644
> --- a/doc/design-draft.rst
> +++ b/doc/design-draft.rst
> @@ -19,6 +19,7 @@ Design document drafts
>     design-storagetypes.rst
>     design-reason-trail.rst
>     design-device-uuid-name.rst
> +   design-internal-shutdown.rst
>
>  .. vim: set textwidth=72 :
>  .. Local Variables:
> diff --git a/doc/design-internal-shutdown.rst b/doc/design-internal-shutdown.rst
> new file mode 100644
> index 0000000..836d00c
> --- /dev/null
> +++ b/doc/design-internal-shutdown.rst
> @@ -0,0 +1,72 @@
> +============================================================
> +Detection of user-initiated shutdown from inside an instance
> +============================================================
> +
> +.. contents:: :depth: 2
> +
> +This is a design document detailing the implementation of a way for Ganeti to
> +detect whether a machine marked as up but not running was shut down gracefully
> +by the user from inside the machine itself.
> +
> +Current state and shortcomings
> +==============================
> +
> +Ganeti keeps track of the desired status of instances in order to be able to
> +take proper actions (e.g.: reboot) on the ones that happen to crash.
> +Currently, the only way to properly shut down a machine is through Ganeti's own
> +commands, which mark an instance as ``ADMIN_down``.
> +If a user shuts down an instance from inside, through the proper command of the
> +operating system it is running, the instance will be shut down gracefully, but
> +Ganeti is not aware of that: the desired status of the instance will still be
> +marked as ``running``, so when the watcher realises that the instance is down,
> +it will restart it. This behaviour is usually not what the user expects.
> +
> +Proposed changes
> +================
> +
> +We propose to modify Ganeti in such a way that it will detect when an instance
> +was shut down because of an explicit user request. When such a situation is
> +detected, the state of the instance will be set to ``ADMIN_down``, as intended
> +by the user.
> +
> +This design document applies to the Xen backend of Ganeti, because it uses
> +features specific to that hypervisor.
> +
> +Implementation
> +==============
> +
> +Xen knows why a domain is being shut down (a crash or an explicit shutdown
> +or poweroff request), but such information is not usually readily available
> +externally, because all such cases lead to the virtual machine being destroyed
> +immediately after the event is detected.
> +
> +Still, Xen allows the instance configuration file to define what action is
> +taken in all those cases through the ``on_poweroff``, ``on_shutdown`` and
> +``on_crash`` variables. By setting them to ``preserve``, Xen will avoid
> +destroying the domains automatically.
> +
> +When the domain is not destroyed, it can be viewed by using ``xm list`` (or
> +``xl list`` in newer Xen versions), and the ``State`` field of the output will
> +provide useful information.
> +
> +If the state is ``----c-`` it means the instance has crashed.
> +
> +If the state is ``---s--`` it means the instance was properly shut down.
> +
> +If the instance was properly shut down and it is still marked as ``running`` by
> +Ganeti, it means that it was shut down from inside by the user, and the Ganeti
> +status of the instance needs to be changed to ``ADMIN_down``.
> +
> +This will be done at regular intervals by the group watcher, just before
> +deciding which instances to reboot.
> +
> +On top of that, at the same time, the watcher will also need to issue
> +``xm destroy`` commands for all the domains that are in the crashed or
> +shutdown state, since this will not be done automatically by Xen anymore
> +because of the ``preserve`` setting in their config files.

I think this should also be done by gnt-instance start and similar
commands, as they could be issued before the watcher runs.
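
To make the point concrete, here is a rough sketch of a check that both
the watcher and the start path could share. It is only an illustration,
not existing Ganeti code: the function names, the set of instances
marked as up, and the assumed ``xm list`` column order
(Name, ID, Mem, VCPUs, State, Time(s)) are all assumptions.

```python
# Hypothetical helpers for illustration only; none of these names exist
# in Ganeti. The State flags follow the design text: "----c-" means
# crashed and "---s--" means properly shut down.
CRASHED = "crashed"
SHUTDOWN = "shutdown"
OTHER = "other"


def classify_state(state):
    """Map the State column of ``xm list`` to a coarse status."""
    if len(state) != 6:
        raise ValueError("unexpected State field: %r" % (state,))
    if state[4] == "c":
        return CRASHED
    if state[3] == "s":
        return SHUTDOWN
    return OTHER


def parse_xm_list(output):
    """Return {domain name: State field} from ``xm list`` text output."""
    states = {}
    for line in output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 6:
            # Assumed columns: Name ID Mem VCPUs State Time(s)
            states[fields[0]] = fields[4]
    return states


def plan_cleanup(states, marked_up):
    """Decide which domains to destroy and which to mark ADMIN_down.

    ``marked_up`` is the set of instance names Ganeti still expects to
    be running.
    """
    to_destroy, to_mark_down = [], []
    for name, state in sorted(states.items()):
        status = classify_state(state)
        if status == OTHER:
            continue
        # Preserved domains are no longer destroyed by Xen itself.
        to_destroy.append(name)
        if status == SHUTDOWN and name in marked_up:
            # Shut down from inside by the user.
            to_mark_down.append(name)
    return to_destroy, to_mark_down
```

If gnt-instance start ran something like ``plan_cleanup`` before
creating the domain, it would not trip over a preserved domain left
behind between two watcher runs.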

Also, what happens to the output of gnt-instance list? Will it be correct?

Bernardo

> +
> +.. vim: set textwidth=72 :
> +.. Local Variables:
> +.. mode: rst
> +.. fill-column: 72
> +.. End:
> --
> 1.8.2.1
>
