Re: [libvirt] [RFC] Fixing a regression caused by recent CPU driver changes

2017-05-22 Thread Daniel P. Berrange
On Thu, May 18, 2017 at 10:22:59AM +0200, Jiri Denemark wrote:
> The big question is how to fix the regression in a backward compatible
> way and still keep the ability to properly check guest CPU ABI with new
> enough libvirt and QEMU. Clearly, we need to keep both the original and
> the updated CPU definition (or one of them and a diff against the
> other).
> 
> I came up with the following options:
> 
> 1. When migrating a domain, the domain XML would contain the original
>CPU def while the updated one would be transferred in a migration
>cookie.
>- doesn't fix save/restore or snapshot revert
>  - could be fixed by not updating the CPU with QEMU < 2.9.0, but it
>won't work when restore is done on a different host with older
>QEMU (and yes, this happens in real life, e.g., with oVirt)
>- doesn't even fix migration after save/restore or snapshot revert

Yep, not an option.

> 
> 2. Use migration cookie and change the save image format to contain
>additional data (something like migration cookie) which a new libvirt
>could make use of while old libvirt would ignore any additional data
>it doesn't know about
>- snapshot XML would need to be updated too, but that's trivial
>- cleanly changing save image format requires its version to be
>  increased and old libvirt will just refuse to read such image
>- this would fix save/restore on the same host or on a host with
>  older QEMU
>- doesn't fix restore on a different host running older libvirt
>  - even this is done by oVirt

The only way we can change save image format, is if we ensure we use
the old format by default, and require a manual VIR_DOMAIN_SAVE_CPU_CHECK
flag to turn on the "new" format.

> 3. Use migration cookie and change the save image format without
>increasing its version (and update snapshot XML)
>- this fixes all cases
>- technically possible by using one of the unused 32b ints and
>  adding \0 at the end of the domain XML followed by anothe XML with
>  the additional data (pointed to by the formerly unused uint32_t)
>- very ugly hack

While it seems ugly, this is kind of what the 'unused' ints are
there for in the first place.

Basically for required incompatible changes we'd bump version, but
for safe opt-in changes we can just use an unused field. So I think
this is ok


> 4. Format both CPUs in domain XML by adding a new subelement in 
>which would list all extra features disabled or enabled by QEMU
>- fixes all cases without the need to use migration cookie, change
>  save image format, or snapshot XML
>- but it may be very confusing for users and it would need to be
>  documented as "output only, don't touch staff"

Yeah this just looks too ugly.

> So my preferred solution is 2. It breaks restore on a host with older
> libvirt, but it was never guaranteed to work, even though it usually
> just worked. And we could just tell oVirt to fix there usage :-) Also
> additional data in save image header may be very useful in the future; I
> think I remember someone sighing about the inability to store more than
> just domain XML in a saved image when they were trying to fix some
> compatibility issues.

I think 2 is ok, if we use the VIR_DOMAIN_SAVE_CPU_CHECK to opt-in to
writing the new format with expanded CPU.

It means OpenStack/oVirt would *not* use this flag by default - it would
need to be an opt-in once the admin knows they don't have any nodes with
older version in use.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC] Fixing a regression caused by recent CPU driver changes

2017-05-18 Thread Martin Kletzander

On Thu, May 18, 2017 at 11:14:50AM +0200, Pavel Hrdina wrote:

On Thu, May 18, 2017 at 10:22:59AM +0200, Jiri Denemark wrote:

Hi all,

when I was enhancing libvirt's guest CPU configuration code to be able
to really ensure stable guest CPU ABI, I added a new attribute
//cpu/@check which is nicely backward compatible... an old libvirt will
just ignore it. However, even if check='full' will be ignored, an old
libvirt will still see the updated CPU definition (features added or
removed by the hypervisor will be shown there). And because we need QEMU
2.9.0 to check what features are going to be added or removed before we
actually start the domain, migrating such domain to an older libvirt or
QEMU may fail if QEMU enables a feature which is not supported by the
host CPU. Known features causing problems are, e.g., x2apic, hypervisor,
and arat. To make things even worse, updating a CPU definition with the
automatically added/removed features can be done since QEMU 1.5.0.

Even save/restore or snapshot revert on a single host running new
libvirt and QEMU < 2.9.0 is now affected by this regression.

The big question is how to fix the regression in a backward compatible
way and still keep the ability to properly check guest CPU ABI with new
enough libvirt and QEMU. Clearly, we need to keep both the original and
the updated CPU definition (or one of them and a diff against the
other).

I came up with the following options:

1. When migrating a domain, the domain XML would contain the original
   CPU def while the updated one would be transferred in a migration
   cookie.
   - doesn't fix save/restore or snapshot revert
 - could be fixed by not updating the CPU with QEMU < 2.9.0, but it
   won't work when restore is done on a different host with older
   QEMU (and yes, this happens in real life, e.g., with oVirt)
   - doesn't even fix migration after save/restore or snapshot revert


This is clearly not good enough.


2. Use migration cookie and change the save image format to contain
   additional data (something like migration cookie) which a new libvirt
   could make use of while old libvirt would ignore any additional data
   it doesn't know about
   - snapshot XML would need to be updated too, but that's trivial
   - cleanly changing save image format requires its version to be
 increased and old libvirt will just refuse to read such image
   - this would fix save/restore on the same host or on a host with
 older QEMU
   - doesn't fix restore on a different host running older libvirt
 - even this is done by oVirt


This would be the best solution, however we might need to go with
option 3.  The CPU updating is a new feature but it's enabled
automatically without any way how to disable it and in this case we
should not break anything.

Changing the save image format by introducing new feature to
seve/restore APIs is valid reason to increase a version so old libvirt
will fail if the new feature of save/restore is used.  Unfortunately
this is not the case, we need to update the save image format to make
different feature backward compatible which should not break the save
image compatibility.


3. Use migration cookie and change the save image format without
   increasing its version (and update snapshot XML)
   - this fixes all cases
   - technically possible by using one of the unused 32b ints and
 adding \0 at the end of the domain XML followed by anothe XML with
 the additional data (pointed to by the formerly unused uint32_t)
   - very ugly hack


It's not that ugly.  We just need to workaround a bad design used to
store domain XML in the save image.  Currently there is no easy and
nice way how to add additional data into the XML without breaking
backward compatibility and using \0 is the only backward compatible way
of doing it.  Luckily we have another 15 unused 32b ints :).

The ideal solution would be to use proper XML:


 
   ...
 
 
   ...
 
 
   ...
 
 ...


but since we store the domain XML directly as binary data the only way
how to store additional data for backward compatible features we need to
"extend" the current binary data with additional XML:


 ...

\0

 
   ...
 
 
   ...
 
 ...


and the newly used 32b int will just point to the beginning of the
additional data.



Actually, one bit is enough.  It would just say that there are newer
save image data after the first one.  It could be daisy chained as the
next one would just have that field as well.

I'm ambivalent between 3 and 4.


4. Format both CPUs in domain XML by adding a new subelement in 
   which would list all extra features disabled or enabled by QEMU
   - fixes all cases without the need to use migration cookie, change
 save image format, or snapshot XML
   - but it may be very confusing for users and it would need to be
 documented as "output only, don't touch staff"


I don't like it, this just puts an extra noise into the XML that is
irrelevant for users.


5. Add the additional data in 

Re: [libvirt] [RFC] Fixing a regression caused by recent CPU driver changes

2017-05-18 Thread Pavel Hrdina
On Thu, May 18, 2017 at 10:22:59AM +0200, Jiri Denemark wrote:
> Hi all,
> 
> when I was enhancing libvirt's guest CPU configuration code to be able
> to really ensure stable guest CPU ABI, I added a new attribute
> //cpu/@check which is nicely backward compatible... an old libvirt will
> just ignore it. However, even if check='full' will be ignored, an old
> libvirt will still see the updated CPU definition (features added or
> removed by the hypervisor will be shown there). And because we need QEMU
> 2.9.0 to check what features are going to be added or removed before we
> actually start the domain, migrating such domain to an older libvirt or
> QEMU may fail if QEMU enables a feature which is not supported by the
> host CPU. Known features causing problems are, e.g., x2apic, hypervisor,
> and arat. To make things even worse, updating a CPU definition with the
> automatically added/removed features can be done since QEMU 1.5.0.
> 
> Even save/restore or snapshot revert on a single host running new
> libvirt and QEMU < 2.9.0 is now affected by this regression.
> 
> The big question is how to fix the regression in a backward compatible
> way and still keep the ability to properly check guest CPU ABI with new
> enough libvirt and QEMU. Clearly, we need to keep both the original and
> the updated CPU definition (or one of them and a diff against the
> other).
> 
> I came up with the following options:
> 
> 1. When migrating a domain, the domain XML would contain the original
>CPU def while the updated one would be transferred in a migration
>cookie.
>- doesn't fix save/restore or snapshot revert
>  - could be fixed by not updating the CPU with QEMU < 2.9.0, but it
>won't work when restore is done on a different host with older
>QEMU (and yes, this happens in real life, e.g., with oVirt)
>- doesn't even fix migration after save/restore or snapshot revert

This is clearly not good enough.

> 2. Use migration cookie and change the save image format to contain
>additional data (something like migration cookie) which a new libvirt
>could make use of while old libvirt would ignore any additional data
>it doesn't know about
>- snapshot XML would need to be updated too, but that's trivial
>- cleanly changing save image format requires its version to be
>  increased and old libvirt will just refuse to read such image
>- this would fix save/restore on the same host or on a host with
>  older QEMU
>- doesn't fix restore on a different host running older libvirt
>  - even this is done by oVirt

This would be the best solution, however we might need to go with 
option 3.  The CPU updating is a new feature but it's enabled
automatically without any way how to disable it and in this case we
should not break anything.

Changing the save image format by introducing new feature to
seve/restore APIs is valid reason to increase a version so old libvirt
will fail if the new feature of save/restore is used.  Unfortunately
this is not the case, we need to update the save image format to make
different feature backward compatible which should not break the save
image compatibility.

> 3. Use migration cookie and change the save image format without
>increasing its version (and update snapshot XML)
>- this fixes all cases
>- technically possible by using one of the unused 32b ints and
>  adding \0 at the end of the domain XML followed by anothe XML with
>  the additional data (pointed to by the formerly unused uint32_t)
>- very ugly hack

It's not that ugly.  We just need to workaround a bad design used to
store domain XML in the save image.  Currently there is no easy and
nice way how to add additional data into the XML without breaking
backward compatibility and using \0 is the only backward compatible way
of doing it.  Luckily we have another 15 unused 32b ints :).

The ideal solution would be to use proper XML:


  
...
  
  
...
  
  
...
  
  ...


but since we store the domain XML directly as binary data the only way
how to store additional data for backward compatible features we need to
"extend" the current binary data with additional XML:


  ...

\0

  
...
  
  
...
  
  ...


and the newly used 32b int will just point to the beginning of the
additional data.

> 4. Format both CPUs in domain XML by adding a new subelement in 
>which would list all extra features disabled or enabled by QEMU
>- fixes all cases without the need to use migration cookie, change
>  save image format, or snapshot XML
>- but it may be very confusing for users and it would need to be
>  documented as "output only, don't touch staff"

I don't like it, this just puts an extra noise into the XML that is
irrelevant for users.

> 5. Add the additional data in 
>- a bad joke really
> 
> 
> So my preferred solution is 2. It breaks restore on a host with older
> libvirt, but it was never guaranteed to work, even