On 17 May 2013 19:30, Michele Tartara <[email protected]> wrote:
> On Fri, May 17, 2013 at 6:05 PM, Bernardo Dal Seno <[email protected]> wrote:
>>
>> On 17 May 2013 11:46, Michele Tartara <[email protected]> wrote:
>> > Ganeti is currently not able to detect a legit shutdown request performed by a
>> > user from inside a Xen domain.
>> >
>> > This patch provides a design document to implement a mechanism able to cope with
>> > such events.
>> >
>> > Signed-off-by: Michele Tartara <[email protected]>
>> > ---
>> >  Makefile.am                      |  1 +
>> >  doc/design-draft.rst             |  1 +
>> >  doc/design-internal-shutdown.rst | 72 ++++++++++++++++++++++++++++++++++++++++
>> >  3 files changed, 74 insertions(+)
>> >  create mode 100644 doc/design-internal-shutdown.rst
>> >
>> > diff --git a/Makefile.am b/Makefile.am
>> > index 037cf53..f66624e 100644
>> > --- a/Makefile.am
>> > +++ b/Makefile.am
>> > @@ -410,6 +410,7 @@ docinput = \
>> >         doc/design-htools-2.3.rst \
>> >         doc/design-http-server.rst \
>> >         doc/design-impexp2.rst \
>> > +       doc/design-internal-shutdown.rst \
>> >         doc/design-lu-generated-jobs.rst \
>> >         doc/design-linuxha.rst \
>> >         doc/design-multi-reloc.rst \
>> > diff --git a/doc/design-draft.rst b/doc/design-draft.rst
>> > index ccb2f93..9a1d2b1 100644
>> > --- a/doc/design-draft.rst
>> > +++ b/doc/design-draft.rst
>> > @@ -19,6 +19,7 @@ Design document drafts
>> >     design-storagetypes.rst
>> >     design-reason-trail.rst
>> >     design-device-uuid-name.rst
>> > +   design-internal-shutdown.rst
>> >
>> >  .. vim: set textwidth=72 :
>> >  .. Local Variables:
>> > diff --git a/doc/design-internal-shutdown.rst b/doc/design-internal-shutdown.rst
>> > new file mode 100644
>> > index 0000000..836d00c
>> > --- /dev/null
>> > +++ b/doc/design-internal-shutdown.rst
>> > @@ -0,0 +1,72 @@
>> > +============================================================
>> > +Detection of user-initiated shutdown from inside an instance
>> > +============================================================
>> > +
>> > +.. contents:: :depth: 2
>> > +
>> > +This is a design document detailing the implementation of a way for Ganeti to
>> > +detect whether a machine marked as up but not running was shut down gracefully
>> > +by the user from inside the machine itself.
>> > +
>> > +Current state and shortcomings
>> > +==============================
>> > +
>> > +Ganeti keeps track of the desired status of instances in order to be able to
>> > +take proper actions (e.g. reboot) on the ones that happen to crash.
>> > +Currently, the only way to properly shut down a machine is through Ganeti's own
>> > +commands, which mark an instance as ``ADMIN_down``.
>> > +If a user shuts down an instance from inside, through the proper command of the
>> > +operating system it is running, the instance will be shut down gracefully, but
>> > +Ganeti is not aware of that: the desired status of the instance will still be
>> > +marked as ``running``, so when the watcher realises that the instance is down,
>> > +it will restart it. This behaviour is usually not what the user expects.
>> > +
>> > +Proposed changes
>> > +================
>> > +
>> > +We propose to modify Ganeti in such a way that it will detect when an instance
>> > +was shut down because of an explicit user request. When such a situation is
>> > +detected, the state of the instance will be set to ``ADMIN_down``, as intended by
>> > +the user.
>> > +
>> > +This design document applies to the Xen backend of Ganeti, because it uses
>> > +features specific to that hypervisor.
>> > +
>> > +Implementation
>> > +==============
>> > +
>> > +Xen knows why a domain is being shut down (a crash or an explicit shutdown
>> > +or poweroff request), but such information is not usually readily available
>> > +externally, because all such cases lead to the virtual machine being destroyed
>> > +immediately after the event is detected.
>> > +
>> > +Still, Xen allows the instance configuration file to define which action is
>> > +taken in all those cases through the ``on_poweroff``, ``on_shutdown`` and
>> > +``on_crash`` variables. When they are set to ``preserve``, Xen will not
>> > +destroy the domains automatically.
>> > +
>> > +When the domain is not destroyed, it can be viewed by using ``xm list`` (or
>> > +``xl list`` in newer Xen versions), and the ``State`` field of the output will
>> > +indicate why the domain stopped.
>> > +
>> > +If the state is ``----c-`` it means the instance has crashed.
>> > +
>> > +If the state is ``---s--`` it means the instance was properly shut down.
>> > +
>> > +If the instance was properly shut down and it is still marked as ``running`` by
>> > +Ganeti, it means that it was shut down from inside by the user, and the Ganeti
>> > +status of the instance needs to be changed to ``ADMIN_down``.
>> > +
>> > +This check will be done at regular intervals by the group watcher, just before
>> > +deciding which instances to reboot.
>> > +
>> > +On top of that, at the same time, the watcher will also need to issue
>> > +``xm destroy`` commands for all the domains that are in crashed or shutdown state,
>> > +since this will not be done automatically by Xen anymore because of the
>> > +``preserve`` setting in their config files.
>>
>> I think that should also be done by gnt-instance start and
>> similar commands, as they could be issued before the watcher runs.
>>
>> Also, what happens to output of gnt-instance list? Will it be correct?
>>
> Read my reply to Guido's emails and you'll find the answer to your
> questions. :-)

If only I found them. But I guess I'll wait for the revised doc. :-)

Bernardo

>
> Thanks for pointing it out, though.
> I'll soon send a revised design doc containing those clarifications.
>
> Thanks,
> Michele
