On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote:
On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote:
On 11/15/2010 08:52 AM, Juan Quintela wrote:
"Michael S. Tsirkin" <m...@redhat.com> wrote:
There's no reason for tap to run when VM is stopped.
If we let it, it confuses the bridge on TX
and corrupts DMA memory on RX.
Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
While we are here, which handlers make sense to run while stopped?
/me can think of the normal console, non live migration, loadvm and not
much more. Perhaps it is easier to just move the other way around?
I'm not sure I concur that this is really a problem.
Semantically, I don't think that stop has to imply that the guest
memory no longer changes.
Regards,
Anthony Liguori
Later, Juan.
Well, I do not really know about vmstop that is not for migration.
They are separate. For instance, we don't rely on stop to pause pending
disk I/O because we don't serialize pending disk I/O operations.
Instead, we flush all pending I/O and rely on the fact that disk I/O
requests are always submitted in the context of a vCPU operation. This
assumption breaks down with ioeventfd, though, so we need to revisit it.
Most vmstop calls are for migration, and there the problems are very
real.
First, it's not just memory. At least for network transmit, sending out
packets with the same MAC from two locations is a problem. See?
I agree it's a problem but I'm not sure that just marking fd handlers
really helps.
Bottom halves can also trigger transmits. I think that if we put
something in the network layer that just queued packets if the vm is
stopped, it would be a more robust solution to the problem.
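The queueing idea above can be sketched roughly as follows. This is only an
illustration, not QEMU's actual net layer: struct net_gate, net_gate_send(),
net_gate_set_running() and count_bytes_tx() are hypothetical names made up
for the example. The point is that every transmit path (fd handler, bottom
half, timer) funnels through one function that checks the VM state, so no
individual handler needs to be marked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical backend transmit callback (e.g. a write to the tap fd). */
typedef void (*tx_fn)(const void *buf, size_t len, void *opaque);

struct pkt {
    struct pkt *next;
    size_t len;
    unsigned char data[];
};

/* Hypothetical gate sitting between devices and the network backend. */
struct net_gate {
    bool vm_running;
    struct pkt *head, **tail;   /* FIFO of packets held while stopped */
    tx_fn send;
    void *opaque;
};

static void net_gate_init(struct net_gate *g, tx_fn send, void *opaque)
{
    g->vm_running = true;
    g->head = NULL;
    g->tail = &g->head;
    g->send = send;
    g->opaque = opaque;
}

/* Transmit now if the VM runs, otherwise copy and queue the packet.
 * Returns 0 on success, -1 if the packet could not be queued. */
static int net_gate_send(struct net_gate *g, const void *buf, size_t len)
{
    if (g->vm_running) {
        g->send(buf, len, g->opaque);
        return 0;
    }
    struct pkt *p = malloc(sizeof(*p) + len);
    if (!p) {
        return -1;
    }
    p->next = NULL;
    p->len = len;
    memcpy(p->data, buf, len);
    *g->tail = p;
    g->tail = &p->next;
    return 0;
}

/* Called on vm stop/start; on start, flush held packets in order. */
static void net_gate_set_running(struct net_gate *g, bool running)
{
    g->vm_running = running;
    if (!running) {
        return;
    }
    while (g->head) {
        struct pkt *p = g->head;
        g->head = p->next;
        g->send(p->data, p->len, g->opaque);
        free(p);
    }
    g->tail = &g->head;
}

/* Example backend for experimenting: just counts transmitted bytes. */
static void count_bytes_tx(const void *buf, size_t len, void *opaque)
{
    (void)buf;
    *(size_t *)opaque += len;
}
```

On the migration source the queue would presumably be dropped rather than
flushed once the destination takes over; that policy is left out of the
sketch.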
For memory, it is much worse: any memory changes can either get
discarded or not. This breaks consistency guarantees that the guest relies
upon. Imagine virtio index getting updated but content not being
updated. See?
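A toy model of the index/content mismatch described above, under loud
assumptions: struct ring is a simplified stand-in and not the virtio ring
layout, and copy_in_two_steps() stands in for migration copying different
pages at different times while a handler is still writing guest memory.

```c
#include <stdint.h>
#include <string.h>

#define RING_SIZE 4

/* Simplified stand-in for a ring: content plus a published index. */
struct ring {
    uint32_t entries[RING_SIZE];
    uint16_t idx;               /* number of entries published so far */
};

/* Device-side write: fill the slot first, then publish it via idx. */
static void ring_publish(struct ring *r, uint32_t value)
{
    r->entries[r->idx % RING_SIZE] = value;
    r->idx++;
}

/* Copy the content first and the index later. If write_between is set,
 * a device write lands on the source between the two copies, as it
 * could if a handler keeps running after the VM is stopped. */
static void copy_in_two_steps(struct ring *dst, struct ring *src,
                              int write_between, uint32_t value)
{
    memcpy(dst->entries, src->entries, sizeof(dst->entries));
    if (write_between) {
        ring_publish(src, value);
    }
    dst->idx = src->idx;
}
```

Starting from a zeroed ring, copy_in_two_steps(&dst, &src, 1, 0xabcd) leaves
dst.idx at 1 while dst.entries[0] is still 0: the destination sees the index
update but not the content it is supposed to cover.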
If you suppress any I/O then the memory changes don't matter because the
same changes will happen on the destination too.
I think this basic problem is the same as Kemari. We can either attempt
to totally freeze a guest which means stopping all callbacks that are
device related or we can prevent I/O from happening which should
introduce enough determinism to fix the problem in practice.
Regards,
Anthony Liguori