On 09/13/2010 08:39 AM, Kevin Wolf wrote:
Yeah, one of the key design points of live migration is to minimize the
number of failure scenarios where you lose a VM. If someone typed the
wrong command line or shared storage hasn't been mounted yet and we
delay failure until live migration is in the critical path, that would
be terribly unfortunate.
We would catch most of them if we tried to open the image when migration
starts and immediately closed it again, leaving it closed until migration
is (almost) completed, so that no other code can possibly use it before
the source has really closed it.
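
To make the early probe concrete, here is a rough sketch, in Python rather than QEMU's C just to keep it short. The name probe_image is made up for illustration and is not a QEMU function. It only shows the open-then-close check that would report a bad path or missing mount at migration start, not the later real open.

```python
import os


def probe_image(path):
    """Open the image read-only and close it at once.

    Returns None if the open succeeded, or an error string otherwise.
    Hypothetical helper, not part of QEMU.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        # Wrong command line, storage not mounted, permissions, ...
        return "cannot open %s: %s" % (path, exc.strerror)
    # Close immediately so nothing can use the image before the
    # source has really closed it.
    os.close(fd)
    return None
```

The destination would call this once when the incoming migration starts and fail right away on a non-None result, then do the real open only when migration is (almost) completed.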
I think the only real advantage is that we fix NFS migration, right?
But if we do invalidate_cache() as you suggested with a close/open of
the qcow2 layer, and also acquire and release a lock in the file layer
by propagating the invalidate_cache(), that should work robustly with NFS.
I think that's a simpler change. Do you see additional advantages to
delaying the open?
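
Roughly what I have in mind for the file layer, again as a Python sketch rather than real QEMU code. The function names are hypothetical, and it assumes that taking and dropping a POSIX lock is enough to make the NFS client revalidate its cached data, which would need checking against the NFS client in use.

```python
import fcntl
import os


def invalidate_file_cache(fd):
    """Hypothetical file-layer invalidate_cache(): acquire and release a lock.

    Assumption: on NFS, the lock round trip makes the client revalidate
    its cached data for this file.
    """
    fcntl.lockf(fd, fcntl.LOCK_SH)  # acquire a shared lock on the whole file
    fcntl.lockf(fd, fcntl.LOCK_UN)  # release it again right away


def reopen_after_migration(path):
    """Open the image, then propagate the cache invalidation to the file."""
    fd = os.open(path, os.O_RDONLY)
    invalidate_file_cache(fd)
    return fd
```

The format layer (qcow2 here) would do its own close/open on top of this, so that it rereads its metadata after the file layer has dropped stale data.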
Regards,
Anthony Liguori
Kevin