On Wed, May 22, 2013 at 1:35 PM, Michele Tartara <[email protected]> wrote:

> On Wed, May 22, 2013 at 1:30 PM, Guido Trotter <[email protected]> wrote:
>
>>
>>
>>
>> On Wed, May 22, 2013 at 1:25 PM, Michele Tartara <[email protected]> wrote:
>>
>>> On Wed, May 22, 2013 at 1:07 PM, Guido Trotter <[email protected]> wrote:
>>>
>>>> Ack, thanks.
>>>>
>>>> This introduced a race condition in instance start, then, but there's
>>>> nothing we can do about it except document it, I guess.
>>>>
>>>>
>>> Why in instance start? When you are starting an instance, either the
>>> instance is not running (and everything is fine), or it is in the preserved
>>> state (and therefore it's first cleaned and then started again).
>>>
>>> If it is already running, it will be detected as such, and the start job
>>> will not run. If the instance is shut down after that, I don't think this
>>> introduces any race condition.
>>>
>>>
>> Ack, true. Well, this sounds sensible anyway, so let's update the design
>> and see where we get.
>>
>>
> Interdiff:
>
> diff --git a/doc/design-internal-shutdown.rst
> b/doc/design-internal-shutdown.rst
> index 8b5e3c3..8b4d0fb 100644
> --- a/doc/design-internal-shutdown.rst
> +++ b/doc/design-internal-shutdown.rst
> @@ -84,14 +84,12 @@ that only query the state of instances will not run the cleanup function.
>  The cleanup operation includes both node-specific operations (the actual
>  destruction of the stopped domains) and configuration changes, to be performed
>  on the master node (marking as offline an instance that was shut down
> -internally). Therefore, it will be implemented by adding a LU in cmdlib
> -(``LUCleanupInstances``). A Job executing such an opcode will be submitted by
> -the watcher to perform the cleanup.
> -
> -The node daemon will have to be modified in order to support at least the
> -following RPC calls:
> - * A call to list all the instances that have been shutdown from inside
> - * A call to destroy a domain
> +internally). The watcher (that runs on every node) will be able to detect the
> +instances that have been shutdown from inside by directly querying the
> +hypervisor. It will then submit to the master node a series of
> +``InstanceShutdown`` jobs that will mark such instances as ``ADMIN_down``
> +and clean them up (after the functionality of ``InstanceShutdown`` will have
> +been extended as specified in this design document).
>

Actually, I think you don't want the watcher to do this from each node, but
rather from the master, fetching the instance list and submitting the relevant
jobs. Am I wrong? Why should the watcher query the hypervisor directly?

Thanks,

Guido
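[Editorial note: to make the master-driven alternative Guido suggests concrete, here is a rough sketch of the logic. All names (`find_internally_shutdown`, `submit_cleanup_jobs`, the string `"OP_INSTANCE_SHUTDOWN"`, and the state values) are illustrative placeholders, not Ganeti's actual API: the master-side watcher compares the configured admin state of each instance against its live state and submits one shutdown/cleanup job per instance that stopped from the inside.]

```python
# Hypothetical sketch of a master-driven cleanup pass, as discussed in the
# thread above. None of these names are taken from Ganeti's real codebase.


def find_internally_shutdown(config_states, live_states):
    """Return instances marked 'up' in the config but stopped on the node.

    config_states: {instance_name: admin_state}, e.g. {"web1": "up"}
    live_states:   {instance_name: hypervisor_state}, e.g. {"web1": "shutdown"}
    """
    return sorted(
        name
        for name, admin in config_states.items()
        if admin == "up" and live_states.get(name) == "shutdown"
    )


def submit_cleanup_jobs(instances, submit_fn):
    """Submit one shutdown/cleanup job per internally-stopped instance.

    submit_fn(opcode, instance_name) stands in for the real job submission
    call and should return a job identifier.
    """
    return [submit_fn("OP_INSTANCE_SHUTDOWN", name) for name in instances]
```

Under this model only the master needs the aggregated instance list; individual nodes never query the hypervisor on the watcher's behalf, which is exactly the simplification being argued for.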
