A couple of things:
1. The current implementation of suspend doesn't use the qcow extension
for snapshots; instead, it does a "migration" to a file. In this case,
write performance would be important if you didn't set stopFirst to True
in the call to __stopVm.
2. I seem to recall the other issue being the possibility of a seek +
modify being done to the snapshot. In the case of Qemu, though, we're
streaming the snapshot out, so this isn't an issue. I think there was
also some concern over whether HDFS would even support the streaming
properly, but since an implementation of the DFS interface using HDFS
never materialized, I don't think this is a problem (whether or not it
would still be a problem with HDFS, I don't know).
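To make point 1 concrete, here's a rough sketch of suspend as a "migration"
to a file via the qemu monitor. The function and parameter names
(suspend_vm, send_monitor_command) are illustrative, not Tashi's actual
qemu backend code, and the exact monitor syntax may differ by qemu version:

```python
def suspend_vm(send_monitor_command, suspend_path, stop_first=True):
    """Suspend a VM by streaming its state to a file with qemu's migrate.

    If stop_first is True, the guest is paused before migration begins,
    so write speed to suspend_path only affects how long the suspend
    takes. If False, the guest keeps running and dirtying memory while
    the state streams out, so slow writes hurt convergence.
    """
    commands = []
    if stop_first:
        commands.append("stop")  # pause the guest first
    # Stream the VM state out through an external command rather than
    # seeking within a qcow snapshot.
    commands.append('migrate "exec:cat > %s"' % suspend_path)
    for cmd in commands:
        send_monitor_command(cmd)
    return commands
```

Because the state is streamed through a pipe, there is no seek + modify on
the output file, which is why point 2 isn't a problem here.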
So I think your solution is a good one, Mike S.
More generally, though, if you did need to preserve immediate write
speed, allow seek + modify, or something similar, you could imagine
extending the DFS interface with a call that provides a temporary file
with those semantics and a second call to finalize it. In the VFS case,
the first call could return the final path directly and the second could
be a no-op. In the case of HDFS, the first call could provide a path to
local scratch space and the second could perform the copy. This doesn't
seem to be necessary at this point, but I think that was the original
idea behind some of the structure.
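A minimal sketch of that two-call extension, assuming hypothetical method
names (openLocal, finalize) that are not part of Tashi's actual DFS
interface:

```python
import os
import shutil

class VfsBackend:
    """Local filesystem: both calls are effectively no-ops."""
    def __init__(self, root):
        self.root = root
    def openLocal(self, name):
        # The DFS path is already a local path; hand it out directly.
        return os.path.join(self.root, name)
    def finalize(self, name):
        # Nothing to do: writes already landed in the final location.
        pass

class HdfsLikeBackend:
    """DFS without seek + modify: stage in scratch, copy on finalize."""
    def __init__(self, scratch, store):
        self.scratch = scratch
        self.store = store
    def openLocal(self, name):
        # Local scratch space gives fast writes and seek + modify.
        return os.path.join(self.scratch, name)
    def finalize(self, name):
        # The second call pays the copy cost into the real store.
        shutil.copy(os.path.join(self.scratch, name),
                    os.path.join(self.store, name))
```

A caller would write the snapshot to whatever path openLocal returns and
then call finalize, without knowing which backend it is talking to.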
- Michael
On 3/23/2012 4:41 PM, Kozuch, Michael A wrote:
RETRACTION: I thought the copying under discussion was in the migration path. Somehow I read
"migrate" rather than "suspend". MichaelS's solution seems reasonable.
Mike
-----Original Message-----
From: Kozuch, Michael A [mailto:[email protected]]
Sent: Friday, March 23, 2012 3:48 PM
To: [email protected]
Subject: RE: suspending machines in Tashi
I recall that there was an issue, but I don't recall what it was.
I do agree, though, that making two copies seems unnecessary. Another
solution might be to store the snapshot locally and then send it directly
to the local disk on the target client. This approach might be more
scalable in that, if I were to migrate 100 VMs, they wouldn't necessarily
all use the same resource.
Mike