On Sat, May 18, 2013 at 10:07 AM, Michele Tartara <[email protected]>wrote:

> +list
>
>
>
> On Fri, May 17, 2013 at 4:26 PM, Michele Tartara <[email protected]>wrote:
>
>> On Fri, May 17, 2013 at 2:50 PM, Guido Trotter <[email protected]>wrote:
>>
>>>
>>>
>>>
>>> On Fri, May 17, 2013 at 2:30 PM, Michele Tartara <[email protected]>wrote:
>>>
>>>> On Fri, May 17, 2013 at 10:54 AM, Guido Trotter 
>>>> <[email protected]>wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, May 17, 2013 at 10:46 AM, Michele Tartara <[email protected]
>>>>> > wrote:
>>>>>
>>>>>> Ganeti is currently not able to detect a legit shutdown request
>>>>>> performed by a
>>>>>> user from inside a Xen domain.
>>>>>>
>>>>>> This patch provides a design document to implement a mechanism able
>>>>>> to cope with
>>>>>> such events.
>>>>>>
>>>>>> Signed-off-by: Michele Tartara <[email protected]>
>>>>>> ---
>>>>>>  Makefile.am                      |  1 +
>>>>>>  doc/design-draft.rst             |  1 +
>>>>>>  doc/design-internal-shutdown.rst | 72
>>>>>> ++++++++++++++++++++++++++++++++++++++++
>>>>>>  3 files changed, 74 insertions(+)
>>>>>>  create mode 100644 doc/design-internal-shutdown.rst
>>>>>>
>>>>>> diff --git a/Makefile.am b/Makefile.am
>>>>>> index 037cf53..f66624e 100644
>>>>>> --- a/Makefile.am
>>>>>> +++ b/Makefile.am
>>>>>> @@ -410,6 +410,7 @@ docinput = \
>>>>>>         doc/design-htools-2.3.rst \
>>>>>>         doc/design-http-server.rst \
>>>>>>         doc/design-impexp2.rst \
>>>>>> +       doc/design-internal-shutdown.rst \
>>>>>>         doc/design-lu-generated-jobs.rst \
>>>>>>         doc/design-linuxha.rst \
>>>>>>         doc/design-multi-reloc.rst \
>>>>>> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
>>>>>> index ccb2f93..9a1d2b1 100644
>>>>>> --- a/doc/design-draft.rst
>>>>>> +++ b/doc/design-draft.rst
>>>>>> @@ -19,6 +19,7 @@ Design document drafts
>>>>>>     design-storagetypes.rst
>>>>>>     design-reason-trail.rst
>>>>>>     design-device-uuid-name.rst
>>>>>> +   design-internal-shutdown.rst
>>>>>>
>>>>>>  .. vim: set textwidth=72 :
>>>>>>  .. Local Variables:
>>>>>> diff --git a/doc/design-internal-shutdown.rst
>>>>>> b/doc/design-internal-shutdown.rst
>>>>>> new file mode 100644
>>>>>> index 0000000..836d00c
>>>>>> --- /dev/null
>>>>>> +++ b/doc/design-internal-shutdown.rst
>>>>>> @@ -0,0 +1,72 @@
>>>>>> +============================================================
>>>>>> +Detection of user-initiated shutdown from inside an instance
>>>>>> +============================================================
>>>>>> +
>>>>>> +.. contents:: :depth: 2
>>>>>> +
>>>>>> +This is a design document detailing the implementation of a way for
>>>>>> Ganeti to
>>>>>> +detect whether a machine marked as up but not running was shutdown
>>>>>> gracefully
>>>>>> +by the user from inside the machine itself.
>>>>>> +
>>>>>> +Current state and shortcomings
>>>>>> +==============================
>>>>>> +
>>>>>> +Ganeti keeps track of the desired status of instances in order to be
>>>>>> able to
>>>>>> +take proper actions (e.g.: reboot) on the ones that happen to crash.
>>>>>> +Currently, the only way to properly shut down a machine is through
>>>>>> Ganeti's own
>>>>>> +commands, that will mark an instance as ``ADMIN_down``.
>>>>>> +If a user shuts down an instance from inside, through the proper
>>>>>> command of the
>>>>>> +operating system it is running, the instance will be shutdown
>>>>>> gracefully, but
>>>>>> +Ganeti is not aware of that: the desired status of the instance will
>>>>>> still be
>>>>>> +marked as ``running``, so when the watcher realises that the
>>>>>> instance is down,
>>>>>> +it will restart it. This behaviour is usually not what the user
>>>>>> expects.
>>>>>> +
>>>>>> +Proposed changes
>>>>>> +================
>>>>>> +
>>>>>> +We propose to modify Ganeti in such a way that it will detect when
>>>>>> an instance
>>>>>> +was shutdown because of an explicit user request. When such a
>>>>>> situation is
>>>>>> +detected, the state of the instance will be set to ADMIN_down, as
>>>>>> intended by
>>>>>> +the user.
>>>>>> +
>>>>>>
>>>>>
>>>>> Should we provide an option to just restart it in that case? (as an hv
>>>>> parameter)
>>>>> The default could still be to leave it down.
>>>>>
>>>>
>>>> Do you mean to have an option to tell the system to behave as it does
>>>> currently? Is that really useful?
>>>>
>>>>
>>> Yes. One thing is "we don't want these to appear as errors, like if the
>>> instance have crashed", and a different one is "we're ok if they don't come
>>> back up". That said, perhaps if the user does "halt" it's ok to halt, and
>>> if they do reboot to reboot. So that should be the default.
>>>
>>
>> Ack.
>>
>>
>>>
>>>
>>>>
>>>>>
>>>>>> +This design document applies to the Xen backend of Ganeti, because
>>>>>> it uses
>>>>>> +features specific of such hypervisor.
>>>>>> +
>>>>>>
>>>>>
>>>>> Is there any way to do something similar at least for kvm? I think
>>>>> this is worth investigating.
>>>>>
>>>>
>>>> I had a brief look, and unfortunately it seems like it's not possible.
>>>> But I'll spend some more time trying to do it.
>>>>
>>>
>>> I see a -no-shutdown option in qemu. Not sure what kind of information
>>> the monitor gives, in that case.
>>>
>>
>> Thanks for the hint, I'll have a look at it.
>>
>>
>>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>> +Implementation
>>>>>> +==============
>>>>>> +
>>>>>> +Xen knows why a domain is being shut down (a crash or an explicit
>>>>>> shutdown
>>>>>> +or poweroff request), but such information is not usually readily
>>>>>> available
>>>>>> +externally, because all such cases lead to the virtual machine being
>>>>>> destroyed
>>>>>> +immediately after the event is detected.
>>>>>> +
>>>>>> +Still, Xen allows the instance configuration file to define what
>>>>>> action to be
>>>>>> +taken in all those cases through the ``on_poweroff``,
>>>>>> ``on_shutdown`` and
>>>>>> +``on_crash`` variables. By setting them to ``preserve``, Xen will
>>>>>> avoid
>>>>>> +destroying the domains automatically.
>>>>>> +
>>>>>> +When the domain is not destroyed, it can be viewed by using ``xm
>>>>>> list`` (or ``xl
>>>>>> +list`` in newer Xen versions), and the ``State`` field of the output
>>>>>> will
>>>>>> +provide useful information.
>>>>>> +
>>>>>> +If the state is ``----c-`` it means the instance has crashed.
>>>>>> +
>>>>>> +If the state is ``---s--`` it means the instance was properly
>>>>>> shutdown.
>>>>>> +
>>>>>> +If the instance was properly shutdown and it is still marked as
>>>>>> ``running`` by
>>>>>> +Ganeti, it means that it was shutdown from inside by the user, and
>>>>>> the ganeti
>>>>>> +status of the instance needs to be changed to ``ADMIN_down``.
>>>>>> +
>>>>>> +This will be done at regular intervals by the group watcher, just
>>>>>> before
>>>>>> +deciding which instances to reboot.
>>>>>
>>>>> +
>>>>>> +On top of that, at the same times, the watcher will also need to
>>>>>> issue ``xm
>>>>>> +destroy`` commands for all the domains that are in crashed or
>>>>>> shutdown state,
>>>>>> +since this will not be done automatically by Xen anymore because of
>>>>>> the
>>>>>> +``preserve`` setting in their config files.
>>>>>> +
>>>>>>
>>>>>
>>>>> Does this mean the memory will be freed only 5 minutes later?
>>>>>
>>>>
>>>> Yes, that's the downside of this approach.
>>>> Still, I think the impact is quite limited. First, because I don't
>>>> think that shutting down VMs from inside happens so frequently (but we
>>>> probably need some data to confirm this), and 5 minutes is the upper limit
>>>> anyway. Second, because if we see that it is required, we could have the
>>>> cleanup function running not only in the watcher, but also "on demand"every
>>>> time it is needed, such as before creating a new instance, so that all the
>>>> operations that require memory to be available are sure to be executed in a
>>>> clean state.
>>>>
>>>>
>>> Ack.
>>>
>>
>>>
>>>>  Can you add a note saying that this will not impact the "ganeti
>>>>> admin" issued path, as that can check "xm list" and will do "the right
>>>>> thing" finishing with the destruction?
>>>>>
>>>>
>>>> Sure, this is not a problem.
>>>>
>>>>  Also, what will happen when gnt-instance list is called beteween the
>>>>> shutdown and the watcher run? Do we need to display a different state, 
>>>>> make
>>>>> the cleanup happen then (not sure I like this), or just display ADMIN_Down
>>>>> "as if", although the domain is not destroyed yet?
>>>>>
>>>>
>>>> Given that gnt-instance list shows the status of instances according to
>>>> Ganeti, I think the right thing would be to show the exact situation: the
>>>> expected status, the actual status, and a note saying that an internal
>>>> shutdown was detected.
>>>>
>>>>
>>> Ack, but this needs to be explicit in this design. What exactly is going
>>> to appear in the "status" field? What about in oper_state and admin_state ?
>>> Is there any new field you're adding? (I assume not, but this is important
>>> to specify).
>>>
>>
>> Ok, I'll detail all of this.
>>
>>
>>>
>>>
>>>
>>>>
>>>>> Finally, can you comment on changes to gnt-instance
>>>>> start/failover/migrate and other ops, when they encounter an instance in
>>>>> this state? Will the RPC layer be changed to make this transparent to them
>>>>> (how?) or will they need to deal with this explicitly?
>>>>>
>>>>
>>>> My idea is to run the cleanup function for a single instance just
>>>> before executing commands that would affect the running state of that
>>>> instance. This way, we'd end up being sure we always are in a clean state
>>>> before performing an operation.
>>>> I don't see a need to modify the RPC layer: it seems to me that this is
>>>> an hypervisor specific think, and could therefore be implemented directly
>>>> in the affected hypervisor-specific functions.
>>>>
>>>>
>>> Ack, but please add this to the design doc, specifying that it is only
>>> for functions that perform action on an instance, and not for ones who just
>>> query it (if this is the case as I assume it is).
>>>
>>
>> Ok. Yes, of course, that is the case.
>>
>> Thanks,
>> Michele
>>
>
>
Interdiff:
diff --git a/doc/design-internal-shutdown.rst
b/doc/design-internal-shutdown.rst
index 836d00c..a92d837 100644
--- a/doc/design-internal-shutdown.rst
+++ b/doc/design-internal-shutdown.rst
 We propose to modify Ganeti in such a way that it will detect when an
instance
 was shutdown because of an explicit user request. When such a situation is
-detected, the state of the instance will be set to ADMIN_down, as intended
by
-the user.
+detected, instead of presenting an error as it happens now, either the
state
+of the instance will be set to ADMIN_down, or the instance will be
+automatically rebooted, depending on a instance-specific configuration
value.
+The default behavior in case no such parameter is found will be to follow
+the apparent will of the user, and setting to ADMIN_down an instance that
+was shut down correctly from inside.

 This design document applies to the Xen backend of Ganeti, because it uses
-features specific of such hypervisor.
+features specific of such hypervisor. Initial analysis suggests that a
similar
+approach might be used for KVM as well, so this design document will be
later
+extended to add more details about it.

 Implementation
 ==============
@@ -65,6 +71,40 @@ destroy`` commands for all the domains that are in
crashed or shutdown state,
 since this will not be done automatically by Xen anymore because of the
 ``preserve`` setting in their config files.

+This behavior will be limited to the domains shut down from inside,
because it
+will actually keep the resources of the domain busy until the watcher will
do
+the cleaning job (that, with the default setting, is up to every 5
minutes).
+Still, this is considered acceptable, because it is not frequent for a
domain
+to be shut down this way. The cleanup function will be also run
+automatically just before performing any job that requires resources to be
+available (such as when creating a new instance), in order to ensure that
the
+new resource allocation happens starting from a clean state.
Functionalities
+that only query the state of instances will not run the cleanup function.
+
+This changes are hypervisor-specific and will not affect the external RPC
+interface.
+
+Side effects of the modification
+++++++++++++++++++++++++++++++++
+
+The implementation of this design document will require some commands to be
+changed in order to cope with the new shutdown procedure.
+
+With this modification, also the Ganeti command for shutting down instances
+would leave them in a shutdown but preserved state. Therefore, it will be
+changed in such a way to immediately perform the cleanup of the instance
+after verifying its correct shutdown.
+
+The ``gnt-instance list`` command will need to be able to handle the
situation
+where an instance was shutdown internally but not yet cleaned up.
+The admin_state field will maintain the current meaning unchanged. The
+oper_state field will get a new possible state, ``S``, meaning that the
instance
+was shutdown internally.
+
+The ``gnt-instance info`` command ``State`` field, in such case, will show
a
+message stating that the instance was supposed to be run but was shut down
+internally.
+
 .. vim: set textwidth=72 :
 .. Local Variables:
 .. mode: rst


Cheers,
Michele

Reply via email to