On Wed, May 22, 2013 at 1:30 PM, Guido Trotter <[email protected]> wrote:
> On Wed, May 22, 2013 at 1:25 PM, Michele Tartara <[email protected]> wrote:
>> On Wed, May 22, 2013 at 1:07 PM, Guido Trotter <[email protected]> wrote:
>>> Ack thanks.
>>>
>>> This introduced a race condition in instance start then, but there's
>>> nothing we can do about it, except documenting it, I guess.
>>
>> Why in instance start? When you are starting an instance, either the
>> instance is not running (and everything is fine), or it is in the
>> preserved state (and therefore it's first cleaned and then started
>> again).
>>
>> If it is already running, it will be detected as such, and the start job
>> will not run. If the instance is shut down after that, I don't think
>> this introduces any race condition.
>
> Ack, true. Well, this anyway sounds sensible, so let's update the design
> and see where we get.

Interdiff:

diff --git a/doc/design-internal-shutdown.rst b/doc/design-internal-shutdown.rst
index 8b5e3c3..8b4d0fb 100644
--- a/doc/design-internal-shutdown.rst
+++ b/doc/design-internal-shutdown.rst
@@ -84,14 +84,12 @@ that only query the state of instances will not run the cleanup function.
 The cleanup operation includes both node-specific operations (the actual
 destruction of the stopped domains) and configuration changes, to be performed
 on the master node (marking as offline an instance that was shut down
-internally). Therefore, it will be implemented by adding a LU in cmdlib
-(``LUCleanupInstances``). A Job executing such an opcode will be submitted by
-the watcher to perform the cleanup.
-
-The node daemon will have to be modified in order to support at least the
-following RPC calls:
- * A call to list all the instances that have been shutdown from inside
- * A call to destroy a domain
+internally). The watcher (which runs on every node) will be able to detect
+the instances that have been shut down from inside by directly querying the
+hypervisor. It will then submit to the master node a series of
+``InstanceShutdown`` jobs that will mark such instances as ``ADMIN_down``
+and clean them up (once the functionality of ``InstanceShutdown`` has been
+extended as specified in this design document).
 
 The base hypervisor class (and all the deriving classes) will need two methods
 for implementing such functionalities in a hypervisor-specific way.
@@ -107,16 +105,20 @@ Other required changes
 The implementation of this design document will require some commands to be
 changed in order to cope with the new shutdown procedure.
 
-With this modification, also the Ganeti command for shutting down instances
-would leave them in a shutdown but preserved state. Therefore, it will be
-changed in such a way to immediately perform the cleanup of the instance
-after verifying its correct shutdown.
+With the default shutdown action in Xen set to ``preserve``, the Ganeti
+command for shutting down instances would leave them in a shutdown but
+preserved state. Therefore, it will have to be changed in such a way as to
+immediately perform the cleanup of the instance after verifying its correct
+shutdown. It will also correctly deal with instances that have been shut
+down from inside but are still active according to Ganeti, by detecting this
+situation, destroying the instance and carrying out the rest of the Ganeti
+shutdown procedure as usual.
 
 The ``gnt-instance list`` command will need to be able to handle the situation
 where an instance was shutdown internally but not yet cleaned up.
-The admin_state field will maintain the current meaning unchanged. The
-oper_state field will get a new possible state, ``S``, meaning that the instance
-was shutdown internally.
+The ``admin_state`` field will maintain the current meaning unchanged. The
+``oper_state`` field will get a new possible state, ``S``, meaning that the
+instance was shutdown internally.
 
 The ``gnt-instance info`` command ``State`` field, in such case, will show
 a message stating that the instance was supposed to be run but was shut down

Thanks,
Michele
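[Editor's illustration] The watcher-side detection described in the interdiff could be sketched roughly as below. This is only an illustrative sketch: the names `list_preserved_domains`, `submit_job`, and the opcode tuple format are hypothetical stand-ins, not the actual Ganeti watcher/luxi API.

```python
# Rough sketch of the watcher flow from the interdiff: query the
# hypervisor for internally-stopped domains, then submit one
# InstanceShutdown job per instance to the master node.
# NOTE: list_preserved_domains(), submit_job() and the opcode format
# are hypothetical stand-ins, NOT real Ganeti APIs.

def find_internally_shutdown(hypervisor, admin_states):
    """Return instances the hypervisor reports as shut down from inside.

    hypervisor.list_preserved_domains() stands in for the new
    hypervisor-specific query; admin_states maps instance names to
    their Ganeti admin state.
    """
    preserved = hypervisor.list_preserved_domains()
    # Only instances Ganeti still considers running need cleanup:
    # anything already ADMIN_down was stopped through Ganeti itself.
    return [name for name in preserved
            if admin_states.get(name) != "ADMIN_down"]


def submit_cleanup_jobs(client, instances):
    """Submit one InstanceShutdown job per internally-stopped instance."""
    return [client.submit_job([("OP_INSTANCE_SHUTDOWN", name)])
            for name in instances]
```

The point of the filter is the one Michele makes above: instances already marked `ADMIN_down` were stopped through Ganeti and need no cleanup, so only instances still "running" according to Ganeti but preserved according to the hypervisor get an ``InstanceShutdown`` job.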
