On Mon, Nov 15, 2010 at 10:09:29AM -0600, Anthony Liguori wrote:
> On 11/15/2010 09:18 AM, Michael S. Tsirkin wrote:
> >On Mon, Nov 15, 2010 at 08:55:07AM -0600, Anthony Liguori wrote:
> >>On 11/15/2010 08:52 AM, Juan Quintela wrote:
> >>>"Michael S. Tsirkin"<m...@redhat.com> wrote:
> >>>>There's no reason for tap to run when VM is stopped.
> >>>>If we let it, it confuses the bridge on TX
> >>>>and corrupts DMA memory on RX.
> >>>>
> >>>>Signed-off-by: Michael S. Tsirkin<m...@redhat.com>
> >>>once here, what handlers make sense to run while stopped?
> >>>/me can think of the normal console, non live migration, loadvm and not
> >>>much more. Perhaps it is easier to just move the other way around?
> >>I'm not sure I concur that this is really a problem.
> >>Semantically, I don't think that stop has to imply that the guest
> >>memory no longer changes.
> >>
> >>Regards,
> >>
> >>Anthony Liguori
> >>
> >>>Later, Juan.
> >Well, I do not really know about vmstop that is not for migration.
>
> They are separate. For instance, we don't rely on stop to pause
> pending disk I/O because we don't serialize pending disk I/O
> operations. Instead, we flush all pending I/O and rely on the fact
> that disk I/O requests are always submitted in the context of a vCPU
> operation. This assumption breaks down though with ioeventfd so we
> need to revisit it.
>
> >For most vmstop calls are for migration. And there, the problems are very
> >real.
> >
> >First, it's not just memory. At least for network transmit, sending out
> >packets with the same MAC from two locations is a problem. See?
>
> I agree it's a problem but I'm not sure that just marking fd
> handlers really helps.
>
> Bottom halves can also trigger transmits.
Are there any system ones? Can we just stop processing them?

> I think that if we put
> something in the network layer that just queued packets if the vm is
> stopped, it would be a more robust solution to the problem.

Will only work for -net. The problem is for anything
that can trigger activity when vm is stopped.

> >For memory, it is much worse: any memory changes can either get
> >discarded or not. This breaks consistency guarantees that guest relies
> >upon. Imagine virtio index getting updated but content not being
> >updated. See?
>
> If you suppress any I/O then the memory changes don't matter because
> the same changes will happen on the destination too.

They matter, and same changes won't happen.
Example:

virtio used index is in page 1, it can point at data in page 2.
device writes into data, *then* into index. Order matters,
but won't be preserved: migration assumes memory does not
change after vmstop, and so it might send old values for
data but new values for index. Result will be invalid
data coming into guest.
On the destination guest will pick up the index and get bad (stale) data.

> I think this basic problem is the same as Kemari. We can either
> attempt to totally freeze a guest which means stopping all callbacks
> that are device related or we can prevent I/O from happening which
> should introduce enough determinism to fix the problem in practice.
>
> Regards,
>
> Anthony Liguori

See above. IMO it's a different problem. Unlike Kemari,
I don't really see any drawbacks to stop all callbacks.
Do you?

-- 
MST
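
For readers following the virtio example above, here is a rough toy model of
the ordering problem. It is only a sketch and not QEMU code: the page names,
the dict standing in for guest memory, and both function names are made up
for illustration. It assumes the final migration pass copies the data page
before the index page, and that a device handler is still allowed to run on
the source in between.

```python
# Toy model only: guest memory is a dict of "pages". All names are hypothetical.
DATA_PAGE = "page2_data"          # buffer contents the index points at
INDEX_PAGE = "page1_used_index"   # used index the guest polls


def device_completes_request(mem):
    """Device writes the data first, *then* publishes the new index."""
    mem[DATA_PAGE] = "new-packet"
    mem[INDEX_PAGE] += 1


def final_migration_pass(mem, device_runs_after_stop):
    """Copy pages one at a time, assuming memory is frozen after vmstop."""
    dest = {}
    dest[DATA_PAGE] = mem[DATA_PAGE]        # data page is sent first
    if device_runs_after_stop:
        device_completes_request(mem)       # a handler still fires on the source
    dest[INDEX_PAGE] = mem[INDEX_PAGE]      # index page is sent afterwards
    return dest


def fresh_memory():
    return {DATA_PAGE: "old-packet", INDEX_PAGE: 0}


frozen = final_migration_pass(fresh_memory(), device_runs_after_stop=False)
racy = final_migration_pass(fresh_memory(), device_runs_after_stop=True)
print("frozen source:", frozen)
print("device still running:", racy)
```

In the second case the destination ends up with the new index but the old
data, which is the stale-data situation described in the example. With the
handler suppressed, the copied index and data stay consistent.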