On Mon, Nov 23, 2009 at 01:49:09PM -0600, Anthony Liguori wrote:
> Eduardo Habkost wrote:
>> On Mon, Nov 23, 2009 at 12:28:16PM -0600, Anthony Liguori wrote:
<snip>
>>>>         
>>> After mulling over it a bit, here's what I'd suggest:
>>>
>>> 1) Integrate VMstate with qdev
>>> 2) Introduce a bitmap blacklist for unsupported VMstate versions
>>> 3) Expose that bitmap as a qdev property for each device.
>>> 4) By default, upstream qemu will always set the bitmap to be 100% correct.
>>>
>>> This provides a mechanism for informed users and downstreams to
>>> reduce correctness in favor of migration compatibility on a
>>> case-by-case basis.
>>>     
>>
>> Is this for backward migration?
>
> It's for migrating from an older qemu to a newer one.  Normally, newer  
> qemu will happily support older formats but in this case, we broke  
> something and we need to blacklist the old format.  This lets you  
> override that black list.

Then we can already do that: just return an error from the load function
if version_id is too low.

Doing it with a bitmap on qdev would be more flexible, but it is already
possible today.
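
To make the comparison concrete, here is a minimal sketch of both checks.
It is written in Python only for brevity; the function names, the minimum
version and the default bitmap are all hypothetical and are not qemu code.

```python
# Hypothetical sketch; none of these names are qemu APIs.
MIN_SUPPORTED_VERSION = 2  # assumed: version 1 is the format we broke


def load_by_minimum(version_id):
    """Possible today: refuse anything below a fixed minimum version."""
    if version_id < MIN_SUPPORTED_VERSION:
        return -1  # error, the load is refused
    return 0


def load_by_bitmap(version_id, blacklist_bitmap):
    """Bitmap variant: bit N set means format version N is refused."""
    if blacklist_bitmap & (1 << version_id):
        return -1
    return 0


# Assumed default: only version 1 is blacklisted.
DEFAULT_BLACKLIST = 1 << 1
```

With the bitmap exposed as a property, clearing bit 1 (passing 0 here)
would let an informed user accept the old format again, which the fixed
minimum cannot express without a code change.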


>
>> If so, even with this bitmap, how would the migration source process
>> know which version it should use when generating the savevm data?
>>   
>
> To properly support this in -M pc-0.11, we'll need to be able to set the  
> version to migrate for each qdev device.  Again, this is something that  
> could be overridden as a qdev property.  The effect would be that we  
> force a newer qemu to generate the older savevm format.

Right. Unlike the bitmap, this is something we can't easily
reproduce with the current infrastructure.
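
A rough sketch of what "force a newer qemu to generate the older savevm
format" could look like for one device. The field table, the version
numbers and the save_version parameter are assumptions made up for
illustration; this is not how the current save code is structured.

```python
# Hypothetical sketch; field names and versions are made up.
# Each entry: (field name, format version in which the field was introduced).
FOO_FIELDS = [("a", 1), ("b", 1), ("c", 1), ("d", 2), ("e", 3)]
FOO_CURRENT_VERSION = 3


def save_foo(state, save_version=FOO_CURRENT_VERSION):
    """Emit only the fields that exist in the requested format version."""
    if not 1 <= save_version <= FOO_CURRENT_VERSION:
        raise ValueError("unsupported save version")
    payload = {
        name: state[name]
        for name, since in FOO_FIELDS
        if since <= save_version
    }
    return {"version_id": save_version, "fields": payload}
```

Saving with save_version=1 silently drops the state for "d" and "e",
which is exactly the correctness trade-off the override would be
asking the user or the machine type to accept.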

>
>> (considering that the migration stream is unidirectional, today) We have
>> been considering using a "set-savevm-version" monitor command that would
>> be used by management if backward migration is forced by the user.
>>   
>
> qdev property is the right approach I think.  It's really a per-device  
> setting.  It needs to get tied to machine type too and that's a  
> convenient way to do that.
>
>> BTW, we still have the "machine type" suggestion, that would still keep
>> guest-visible state correctness and allow backward migration when it is
>> 100% correct and safe. With such a mechanism, VMs created with the x.y.1
>> machine type could be safely migrated from x.y.2 to x.y.1. (Although
>> the bitmap suggestion could have some use even in this case, if the user
>> really wants to force migration of a x.y.2 machine to x.y.1).
>>   
>
> In theory, a user can manually specify everything in a machine type.

So the qdev magic would be used to provide input to this system. I see.



>
>>> This takes qemu out of the business of creating these sorts of
>>> policies but allows RHEL to make decisions about what default policy
>>> it uses. It also lets well-informed users of RHEL override those
>>> policy decisions when they deem it to be appropriate.
>>>
>>> This would make me happy both from an upstream qemu perspective and
>>> as a consumer of RHEL.
>>>     
>>
>> What about the suggestion of using multiple sections per device, every
>> time a new feature is added, instead of just increasing the version
>> numbers linearly? It allows us to keep the savevm version info
>> consistent on the multiple downstream trees.
>>   
>
> It doesn't because it's just as likely to get clashes in subsection  
> names.  For instance, RHEL5.4 may call the pvclock msr subsection  
> "pvclock-msrs" and then upstream may call it "pvclock-msrs" and flip the  
> order of the fields.

If we implemented this on RHEL before including it upstream, then we
could have a "RHEL" subversion flag set (or simply call it
"pvclock-msrs-rhel"). But if we backport it, there is no reason to make
the version scheme incompatible.
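
As an illustration of why the suffix matters, the sketch below keys the
field layout on the section name. The section names come from this
thread; the two field names and their ordering are assumptions chosen
only to show the clash.

```python
# Hypothetical sketch; layouts are made up, only the section names are from the thread.
import struct

# Assumed: both trees store the same two 64-bit values, in opposite order.
LAYOUTS = {
    "pvclock-msrs": ("system_time", "wall_clock"),
    "pvclock-msrs-rhel": ("wall_clock", "system_time"),
}


def save_section(name, values):
    """Serialize the values in the field order that belongs to this name."""
    fields = LAYOUTS[name]
    return name, struct.pack("<2Q", *(values[f] for f in fields))


def load_section(name, blob):
    """Decode a blob using the layout registered under the section name."""
    fields = LAYOUTS.get(name)
    if fields is None:
        raise KeyError("unknown section: " + name)
    return dict(zip(fields, struct.unpack("<2Q", blob)))
```

If the downstream blob were labelled "pvclock-msrs" it would load without
error and swap the two values; with the "-rhel" name an upstream build
that lacks that entry fails on an unknown section instead.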


>
> To support downstreams effectively, we need vendor specific versioning  
> so that we can separate the upstream qemu namespace from each of the  
> downstreams.

The problem is that newer versions of downstream code will be branched
off newer upstream versions, in the future. A mechanism that helps
keep the version numbers compatible where possible (not always) would
facilitate code contribution in both directions.


>
>> Suppose we have the following scenario:
>>
>> 1) Device Foo has features A, B, C on "foo" section, sets version to 1
>> 2) Downstream tree (e.g. RHEL) is branched off upstream
>> 3) Device Foo adds support for feature D, version changed to 2
>> 4) Device Foo adds support for feature E, version changed to 3
>> 5) Feature E is backported to a downstream tree. Now it supports
>>    features A,B,C,E, and its versioning scheme will be incompatible with
>>    upstream.
>>   
>
> Downstream adds a "RHEL" subversion.  This allows downstream to add a  
> subversion to each device if it modifies it.  When it backports E, it  
> bumps the downstream version from 0->1.
>
> As long as the backported features aren't enabled, the migration will be
> compatible with upstream.  Once one of these backported features is
> enabled, migration will fail gracefully.
>
>> What I suggest is something like:
>>
>> 1) Device Foo has features A,B,C, on "foo" section (or maybe on "foo.a",
>>    "foo.b", and "foo.c" sections, depending if they make sense
>>    individually)
>> 2) Device Foo adds support for feature D, adds "foo.d" section
>> 3) Device Foo adds support for feature E, adds "foo.e" section
>>   
>
> The combinations blow up quickly.  Just because A,B,C,E works for a  
> given downstream, doesn't mean that it would work with the upstream code  
> base.  Features are rarely so independent of one another.
>
> It also doesn't address things like QXL which aren't just a simple  
> matter of a backported upstream feature.

My point is that sometimes they are clearly independent, and when that
happens, keeping the version schemes compatible is a good thing. The
pvclock MSRs are an example: they are clearly independent from the other
MSRs and it wouldn't hurt Qemu if they were added as a separate
section.

There would be other benefits, too: the pvclock MSR section could be
disabled if pvclock support is disabled on the command line or in the
machine definition. We would even have an answer to the user that wanted
backward migration: "yeah, if you were so sure the guest OS didn't use
pvclock, you could have disabled pvclock when the VM was created, and
migration would be possible".
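
A small sketch of that argument, assuming a destination refuses any
stream containing a section it does not know. The section names, the
"pvclock" config key and the set of sections known to the old build are
hypothetical.

```python
# Hypothetical sketch; section names and the config flag are illustrative only.
def sections_to_save(config):
    """Return the section names a source would emit for this machine config."""
    sections = ["cpu"]
    # Assumed: pvclock is on unless explicitly disabled.
    if config.get("pvclock", True):
        sections.append("cpu.pvclock-msrs")
    return sections


def can_load(stream_sections, known_sections):
    """A destination accepts the stream only if it knows every section in it."""
    return all(name in known_sections for name in stream_sections)


OLD_QEMU_SECTIONS = {"cpu"}  # assumed: the older build predates pvclock
```

Under these assumptions a VM created with pvclock disabled produces a
stream the older build can load, while the default configuration does
not, and the failure comes from a missing section rather than from a
version number.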

Yes, there are many cases where doing this won't be enough or won't be
as simple, and we will need a "sub-version" field like you suggested for
stuff that is too different from upstream. I am just suggesting that we
encourage savevm features to be introduced in a more "modular" way where
possible, to facilitate collaboration.

-- 
Eduardo

