On 10/05/2011 08:50 PM, Luiz Capitulino wrote:
> > > I'm not exactly against the semantics you're proposing, but they don't
> > > seem to fit today's qemu.
> >
> > Today's qemu is broken here.
>
> For me it's broken because it will abort() if you migrate a paused vm, for
> you it seems to be broken at the semantic level.
>
> We can fix the semantics without breaking compatibility.
s/We can/We can't/
I think we should divide stop causes into three groups:
1) those that are undone by QEMU itself:
RSTATE_DEBUG
RSTATE_SAVEVM
RSTATE_PRE_MIGRATE
RSTATE_RESTORE
For these a lock/release scheme is definitely better. The VM should not
start until none of these conditions is in effect, even after a "cont"
command.
2) those that are undone by management:
RSTATE_IO_ERROR
For this we can add a new "retry" monitor command that guarantees no
races if the user issues a "stop" or "cont" command while management is
processing the I/O error. Effectively, it is also a lock/release scheme,
but one controlled by management.
3) those that are undone by "cont":
RSTATE_PRE_LAUNCH
RSTATE_PAUSED
RSTATE_WATCHDOG
RSTATE_POST_MIGRATE
RSTATE_PANICKED
I put here the three runstates where the VM should really not be
restarted at all. We can then add a new "start" command that only flips
these five to RSTATE_RUNNING.
So the runstate is composed of six elements: five lock/unlock states (of
which only one can be unlocked by the user), and one running/paused
state (composed of five pause reasons + "none"). That is, the runstate
is a tuple like [debug, savevm, pre_migrate, restore, io_error,
pause_reason] and for the VM to run it must look like [false, false,
false, false, false, none].
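
To make the tuple concrete, here is a minimal sketch in Python. It is
not QEMU code: the class, field and method names are made up for
illustration, and only the state names come from the lists above.

```python
from dataclasses import dataclass
from enum import Enum


class PauseReason(Enum):
    # The five group-3 reasons, plus "none".
    NONE = "none"
    PRE_LAUNCH = "pre_launch"
    PAUSED = "paused"
    WATCHDOG = "watchdog"
    POST_MIGRATE = "post_migrate"
    PANICKED = "panicked"


@dataclass
class RunState:
    # Hypothetical container; the fields mirror the tuple in the text.
    debug: bool = False        # group 1: released by QEMU itself
    savevm: bool = False
    pre_migrate: bool = False
    restore: bool = False
    io_error: bool = False     # group 2: released by management
    pause_reason: PauseReason = PauseReason.NONE  # group 3

    def can_run(self) -> bool:
        # The VM runs only if no lock is held and no pause reason is set.
        locks = (self.debug, self.savevm, self.pre_migrate,
                 self.restore, self.io_error)
        return not any(locks) and self.pause_reason is PauseReason.NONE
```

With this model, RunState() is runnable, while RunState(savevm=True) or
RunState(pause_reason=PauseReason.PAUSED) is not.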
The four monitor commands would be:
1) "stop":
    if runstate[pause_reason] == none then
        runstate[pause_reason] = paused
2) "retry":
    runstate[io_error] = false
3) "start":
    runstate[pause_reason] = none
There could also be a differentiation between "start" and "start -f",
where "-f" would be needed to get out of RSTATE_POST_MIGRATE,
RSTATE_PANICKED and probably RSTATE_WATCHDOG too.
4) "cont": backwards compatibility provided by "retry"+"start -f".
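
The four commands can be sketched roughly as follows, in Python. This is
a toy model and not a proposed implementation: the dict layout and the
function names are hypothetical, and the FORCE_ONLY set assumes that
watchdog ends up requiring "-f", which is only a "probably" above.

```python
# Hypothetical model: a dict keyed by the tuple element names.
LOCKS = ("debug", "savevm", "pre_migrate", "restore", "io_error")

# Assumption: pause reasons that plain "start" refuses to clear.
FORCE_ONLY = {"post_migrate", "panicked", "watchdog"}


def new_runstate():
    rs = {name: False for name in LOCKS}
    rs["pause_reason"] = "none"
    return rs


def can_run(rs):
    # Runnable only when every lock is released and nothing is paused.
    return (not any(rs[name] for name in LOCKS)
            and rs["pause_reason"] == "none")


def stop(rs):
    # Do not overwrite a more specific pause reason.
    if rs["pause_reason"] == "none":
        rs["pause_reason"] = "paused"


def retry(rs):
    rs["io_error"] = False


def start(rs, force=False):
    # Returns False if the pause reason needs "-f" and it was not given.
    if rs["pause_reason"] in FORCE_ONLY and not force:
        return False
    rs["pause_reason"] = "none"
    return True


def cont(rs):
    # Backwards-compatible behaviour: "retry" followed by "start -f".
    retry(rs)
    start(rs, force=True)
```

Note that in this model "cont" never touches the group-1 locks, so a
"cont" issued during savevm clears the pause reason but can_run() stays
false until QEMU itself releases the savevm lock.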
How does this look?
Paolo