Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Stefan Berger

On 10/11/2011 09:54 AM, Anthony Liguori wrote:

On 10/11/2011 08:27 AM, Juan Quintela wrote:

I've been thinking about it this morning.  I think it's solvable.  We 
need to be able to save off the qdev construction properties right 
before init.  This is just a matter of storing a list of strings.  
Then we need a qdev_torture function that will save the device state 
(will require a dummy QEMUFile that saves to memory).  We then need to 
invoke destruction w/o actually freeing the memory of the device.  We 
should then zero out the device memory.


We then need to run through qdev creation, setting properties based on 
the saved construction properties.  Then we should init and invoke the 
device's reset function.  Finally we can pass the dummy QEMUFile to 
the device's load function (or vmstate).

If you want, I have a 'dummy QEMUFile' implementation...

   Stefan




Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Avi Kivity
On 10/11/2011 04:34 PM, Anthony Liguori wrote:
> On 10/11/2011 09:01 AM, Avi Kivity wrote:
>> On 10/11/2011 03:57 PM, Anthony Liguori wrote:
>>>> What I'm trying to avoid is making choices today that close the
>>>> door on better fixes in the future.
>>>
>>>
>>> I think Juan made a really good point in his earlier post.  We need to
>>> focus on better testing for migration.  With a solid migration torture
>>> test, we can probably eliminate much of the problems we're facing
>>> today.
>>
>> Agree, fingerprinting vmstate should help a lot.  Actually I don't think
>> the visitor is strictly required, the fingerprinter can just walk
>> vmstate structs.
>
> You mean generating a schema?  

Dumping the vmstate descriptions in a canonical format, and having a
tool that verifies that version A is compatible with version B.
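
A minimal sketch of what such a dump-and-compare tool could look like, assuming a heavily simplified model of a vmstate description (a name, a version_id, a minimum_version_id and a flat field list).  The Field and Description types, the "demo-dev" name and both helper functions are hypothetical; real vmstate descriptions carry more (per-field versions, subsections, nested structs).

```python
from typing import NamedTuple, Tuple


class Field(NamedTuple):
    name: str
    type: str
    size: int  # bytes on the wire


class Description(NamedTuple):
    name: str
    version_id: int
    minimum_version_id: int
    fields: Tuple[Field, ...]


def canonical_dump(desc):
    """Render a description as stable, diff-friendly text."""
    lines = [f"{desc.name} version_id={desc.version_id} "
             f"minimum_version_id={desc.minimum_version_id}"]
    lines += [f"  {f.name} {f.type} {f.size}" for f in desc.fields]
    return "\n".join(lines)


def incompatibilities(src, dst):
    """List reasons why a stream saved by src may not load on dst."""
    if src.name != dst.name:
        return [f"different devices: {src.name} vs {dst.name}"]
    problems = []
    if src.version_id > dst.version_id:
        problems.append("destination is older than the stream version")
    if src.version_id < dst.minimum_version_id:
        problems.append("destination no longer accepts this version")
    if src.version_id == dst.version_id and src.fields != dst.fields:
        # The silent-failure case: same version number, different layout.
        problems.append("field layout changed without a version bump")
    return problems
```

The last check is the interesting one: it flags a layout change that kept the same version number, which no amount of run-time testing is guaranteed to hit.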

> I was talking about an active migration torture test.

Those are good, but inherently limited.

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Anthony Liguori

On 10/11/2011 09:01 AM, Avi Kivity wrote:

On 10/11/2011 03:57 PM, Anthony Liguori wrote:

What I'm trying to avoid is making choices today that close the door on
better fixes in the future.



I think Juan made a really good point in his earlier post.  We need to
focus on better testing for migration.  With a solid migration torture
test, we can probably eliminate much of the problems we're facing today.


Agree, fingerprinting vmstate should help a lot.  Actually I don't think
the visitor is strictly required, the fingerprinter can just walk
vmstate structs.


You mean generating a schema?  I was talking about an active migration torture 
test.

Regards,

Anthony Liguori









Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Avi Kivity
On 10/11/2011 03:57 PM, Anthony Liguori wrote:
>> What I'm trying to avoid is making choices today that close the door on
>> better fixes in the future.
>
>
> I think Juan made a really good point in his earlier post.  We need to
> focus on better testing for migration.  With a solid migration torture
> test, we can probably eliminate much of the problems we're facing today.

Agree, fingerprinting vmstate should help a lot.  Actually I don't think
the visitor is strictly required, the fingerprinter can just walk
vmstate structs.


-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Anthony Liguori

On 10/11/2011 08:47 AM, Avi Kivity wrote:

On 10/11/2011 03:27 PM, Anthony Liguori wrote:

5) Implement subsections through the wire as top-level sections (as
originally intended).  Keep existing subsections with (1).



That was (3).



Yes, sorry.


btw, it's reasonable to require that backwards migration is only to a
fully updated stable release, so we can do 5) too, or backport 1).


But given the choice of a nasty silent failure to a
not-quite-up-to-date stable release or failing migration to a fully
up-to-date stable release, I think it's better that we err on the side
of caution.


We're erring on the side of no migration, it seems.


Not being able to migrate because of a recoverable failure is
annoying.  Having a silent failure that possibly results in corruption
is unacceptable.


What I'm trying to avoid is making choices today that close the door on
better fixes in the future.


I think Juan made a really good point in his earlier post.  We need to focus on 
better testing for migration.  With a solid migration torture test, we can 
probably eliminate much of the problems we're facing today.


Regards,

Anthony Liguori








Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Anthony Liguori

On 10/11/2011 08:27 AM, Juan Quintela wrote:

Avi Kivity  wrote:

On 10/10/2011 01:35 PM, Juan Quintela wrote:

Hi

Please send in any agenda items you are interested in covering.



Subsections, version numbers, migration to older releases.


Subsections
---

- Current subsections are a mess (TM).  The idea was to only have them
at the very end of sections.  So it was clear that it was a section
(start with QEMU_VM_SECTION_START), or a subsection of this section,
(start with QEMU_VM_SUBSECTION).  As you can see, there is no possible
ambiguity.

Guess what happened?  We needed subsections in the middle of the struct,
where we can't guarantee what comes after (it can be
QEMU_VM_SUBSECTION).

My last migration "subsection detection fix" fixes this in the majority
of the cases, but you can probably construct a case by hand where it
still fails.

Back to the beginning: Avi wanted/wants subsections to be just normal
sections with a "funny" name ("section_name/subsection_name"), requiring
FIFO ordering or something like that.  So far, so good, but we still
have these problems:
a- we need to ensure that ordering is right (do-able)
b- we need to ensure that "post-load" functions are run in the right
order (also do-able)
c- from the top level, where we only have pointers to the general
state, we need to be able to find the correct "substruct" pointer that
this subsection refers to.  This is kind of complicated :-(

My suggestion/plan:
- integrate my migration detection fix on upstream + stable

- port all current subsections to Avi's approach to see how feasible
   it is.  If IDE subsections can be made to work, everything else should be
   doable.

Version numbers
---

What to do here?  Basically we have been able to integrate all changes
so far using subsections (some of them in a non-trivial way, though).

The last one is the change proposed for wavcapture.  I stated some ideas, but
got no answer from the author.  Basically he made an incompatible change
to the driver, and I can't see a trivial way to make it compatible.
Channels used to be either output or input, and now they need to be both, so
he duplicated the channels.

Migration to older releases
---

Our test framework for that is nonexistent.  That is the most important
issue for this to work.  The problem is that nobody really knows how to do
it.


I've been thinking about it this morning.  I think it's solvable.  We need to be 
able to save off the qdev construction properties right before init.  This is 
just a matter of storing a list of strings.  Then we need a qdev_torture 
function that will save the device state (will require a dummy QEMUFile that 
saves to memory).  We then need to invoke destruction w/o actually freeing the 
memory of the device.  We should then zero out the device memory.


We then need to run through qdev creation, setting properties based on the saved 
construction properties.  Then we should init and invoke the device's reset 
function.  Finally we can pass the dummy QEMUFile to the device's load function 
(or vmstate).
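
A minimal sketch of that round trip, assuming a toy device model rather than real qdev.  ToyDevice, ForgetfulDevice and torture() are hypothetical names, io.BytesIO stands in for the dummy QEMUFile that saves to memory, and re-creating the object from the saved property strings stands in for "destroy, zero the memory, create again".

```python
import io
import struct


class ToyDevice:
    """Hypothetical device: construction properties plus two state fields."""

    def __init__(self, props):
        # props is a list of "key=value" strings, as saved before init.
        self.props = dict(p.split("=", 1) for p in props)
        self.reset()

    def reset(self):
        self.state = {"count": 0, "enabled": 0}

    def save(self, f):
        f.write(struct.pack(">IB", self.state["count"], self.state["enabled"]))

    def load(self, f):
        count, enabled = struct.unpack(">IB", f.read(5))
        self.state = {"count": count, "enabled": enabled}


class ForgetfulDevice(ToyDevice):
    """Same device with a bug: 'enabled' is never put on the wire."""

    def save(self, f):
        f.write(struct.pack(">I", self.state["count"]))

    def load(self, f):
        (self.state["count"],) = struct.unpack(">I", f.read(4))


def torture(dev):
    """Save, re-create from saved properties, reset, load, then compare."""
    saved_props = [f"{k}={v}" for k, v in dev.props.items()]
    before = dict(dev.state)
    f = io.BytesIO()              # in-memory stand-in for the dummy QEMUFile
    dev.save(f)
    fresh = type(dev)(saved_props)  # models destruction + zeroing + creation
    fresh.reset()
    f.seek(0)
    fresh.load(f)
    return fresh.props == dev.props and fresh.state == before
```

Comparing the device state before and after, rather than comparing two saved streams, is a deliberate choice here: a field that is missing from both save and load produces identical streams but different state.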


I'll take a look at implementing this today.  I think it'll be a bit hairy but 
it looks doable to me.


Regards,

Anthony Liguori



One of the ideas is to run the machine, stop, save everything, reload, and
continue.  Or to do it in a loop for each device.  But so far these ideas
haven't moved past the "design" phase (for lack of a better word to
describe "something that is in someone's head and needs to be done").

While we are here, more migration issues:


- VMState finish: Still on the ToDo list.  Once my two series on the list are
   integrated, I expect to work on virtio + other cpus.  No way this is
   going to be done for the 15th, perhaps one week after that.

- migration thread: another thing that I am going to look at, in
   parallel with the previous stuff.

Patches on RHEL not in qemu.git
---

- qcow2 consistency for migration: we need to reload qcow2 headers after
   migration; this should be an easy case of splitting open into open + reload.  We
   have decided that we only support migration with cache=none, so part
   of the series is not needed.

- Huge memory machines: Last time I proposed the series, Anthony agreed
   with everything except the last patch (that was a bandaid, I agree).
   Together with the migration thread described above, we should be done in
   that department.

Changing the protocol?
---

Unless someone appears and finds a use for the new protocol, I will
stay away from changing it.  Things that need to be done once we
change the protocol in an incompatible way:

- send command line arguments through the migration channel, at least
   put support for it there.  Needs qdev/QOM or whatever changes first.

- put sections size/end markers.

- fix the arrays mess.  Basically we need to send things like:
total size of array (think malloc)
number of elements used (how many we sent)
start: (w

Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Avi Kivity
On 10/11/2011 03:27 PM, Anthony Liguori wrote:
>> 5) Implement subsections through the wire as top-level sections (as
>> originally intended).  Keep existing subsections with (1).
>
>
> That was (3).
>

Yes, sorry.

>> btw, it's reasonable to require that backwards migration is only to a
>> fully updated stable release, so we can do 5) too, or backport 1).
>
> But given the choice of a nasty silent failure to a
> not-quite-up-to-date stable release or failing migration to a fully
> up-to-date stable release, I think it's better that we err on the side
> of caution.

We're erring on the side of no migration, it seems.

> Not being able to migrate because of a recoverable failure is
> annoying.  Having a silent failure that possibly results in corruption
> is unacceptable.

What I'm trying to avoid is making choices today that close the door on
better fixes in the future.

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Juan Quintela
Avi Kivity  wrote:
> On 10/10/2011 01:35 PM, Juan Quintela wrote:
>> Hi
>>
>> Please send in any agenda items you are interested in covering.
>>
>
> Subsections, version numbers, migration to older releases.

Subsections
---

- Current subsections are a mess (TM).  The idea was to only have them
at the very end of sections.  So it was clear that it was a section
(start with QEMU_VM_SECTION_START), or a subsection of this section,
(start with QEMU_VM_SUBSECTION).  As you can see, there is no possible
ambiguity.

Guess what happened?  We needed subsections in the middle of the struct,
where we can't guarantee what comes after (it can be
QEMU_VM_SUBSECTION).

My last migration "subsection detection fix" fixes this in the majority
of the cases, but you can probably construct a case by hand where it
still fails.

Back to the beginning: Avi wanted/wants subsections to be just normal
sections with a "funny" name ("section_name/subsection_name"), requiring
FIFO ordering or something like that.  So far, so good, but we still
have these problems:
a- we need to ensure that ordering is right (do-able)
b- we need to ensure that "post-load" functions are run in the right
   order (also do-able)
c- from the top level, where we only have pointers to the general
   state, we need to be able to find the correct "substruct" pointer that
   this subsection refers to.  This is kind of complicated :-(
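
A minimal sketch of points (a) and (c), assuming subsections arrive as ordinary sections named "section_name/subsection_name".  The SectionLoader class, the resolver registry and the "ide/pio_state" name are hypothetical and only illustrate the shape of the problem; point (b), post-load ordering, is not modeled.

```python
class SectionLoader:
    """Toy loader where subsections travel as top-level sections."""

    def __init__(self):
        # "section/subsection" -> function(section_state) -> substruct
        self.resolvers = {}
        # Section names in arrival (FIFO) order.
        self.loaded = []

    def register(self, full_name, resolver):
        self.resolvers[full_name] = resolver

    def load(self, machine, name, payload):
        parent, sep, _sub = name.partition("/")
        if sep:
            # (a) a subsection must arrive after its parent section.
            if parent not in self.loaded:
                raise ValueError(f"{name} arrived before section {parent}")
            # (c) map the top-level state to the substruct it refers to.
            target = self.resolvers[name](machine[parent])
        else:
            target = machine.setdefault(name, {})
        target.update(payload)
        self.loaded.append(name)
```

For example, registering "ide/pio_state" with a resolver such as `lambda ide: ide["drive0"]` lets the loader write the subsection payload into the right drive.  Every device with an embedded subsection would need such a resolver, which is where the complication in (c) comes from.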

My suggestion/plan:
- integrate my migration detection fix on upstream + stable

- port all current subsections to Avi's approach to see how feasible
  it is.  If IDE subsections can be made to work, everything else should be
  doable.

Version numbers
---

What to do here?  Basically we have been able to integrate all changes
so far using subsections (some of them in a non-trivial way, though).

The last one is the change proposed for wavcapture.  I stated some ideas, but
got no answer from the author.  Basically he made an incompatible change
to the driver, and I can't see a trivial way to make it compatible.
Channels used to be either output or input, and now they need to be both, so
he duplicated the channels.

Migration to older releases
---

Our test framework for that is nonexistent.  That is the most important
issue for this to work.  The problem is that nobody really knows how to do
it.

One of the ideas is to run the machine, stop, save everything, reload, and
continue.  Or to do it in a loop for each device.  But so far these ideas
haven't moved past the "design" phase (for lack of a better word to
describe "something that is in someone's head and needs to be done").

While we are here, more migration issues:


- VMState finish: Still on the ToDo list.  Once my two series on the list are
  integrated, I expect to work on virtio + other cpus.  No way this is
  going to be done for the 15th, perhaps one week after that.

- migration thread: another thing that I am going to look at, in
  parallel with the previous stuff.

Patches on RHEL not in qemu.git
---

- qcow2 consistency for migration: we need to reload qcow2 headers after
  migration; this should be an easy case of splitting open into open + reload.  We
  have decided that we only support migration with cache=none, so part
  of the series is not needed.

- Huge memory machines: Last time I proposed the series, Anthony agreed
  with everything except the last patch (that was a bandaid, I agree).
  Together with the migration thread described above, we should be done in
  that department.

Changing the protocol?
---

Unless someone appears and finds a use for the new protocol, I will
stay away from changing it.  Things that need to be done once we
change the protocol in an incompatible way:

- send command line arguments through the migration channel, at least
  put support for it there.  Needs qdev/QOM or whatever changes first.

- put sections size/end markers.

- fix the arrays mess.  Basically we need to send things like:
   total size of array (think malloc)
   number of elements used (how many we send)
   start: (we don't always send data from the start)
   circular buffers:  At the moment, we use some arrays as "circular
   buffers", and we just send to the beginning.
   This is from memory; going through all the array users will
   make things clearer.
   index of array: for indexes we use everything, int8_t, uint8_t,
   int16_t, uint16_t, int32_t, uint32_t.  We should just use one type
   for index, and make all our arrays simpler.

- bitmaps: we need a type to send bitmaps, period.

- remove all the warts that we don't need anymore due to backward
  compatibility.

- cpus: especially x86_*.  Our format support for x86 is a mess, things
  like:
  - how to store doubles (at least 4 formats)
  - a generic way of sending a list of MSRs is needed.  We are
    going to need more MSRs in the future, and we are having a
    new subsection/version for each new MSR.  To make things
 

Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Anthony Liguori

On 10/11/2011 08:21 AM, Avi Kivity wrote:

On 10/11/2011 03:01 PM, Anthony Liguori wrote:

On 10/11/2011 06:48 AM, Avi Kivity wrote:

On 10/10/2011 01:35 PM, Juan Quintela wrote:

Hi

Please send in any agenda items you are interested in covering.



Subsections, version numbers, migration to older releases.


Problem with subsections:

The encoding of a subsection within an embedded structure is ambiguous
because the subsection occurs at the end of the structure.  QEMU may
mistakenly parse what follows the structure as the end-of-subsection
delimiter.

Possible solutions:

1) Juan has a series that adds heuristics to better match the EOS
delimiter.  While not 100% perfect, it should handle practically all
possible cases.

The main issue is that it's not present in older QEMUs which means
migrating a subsection within a structure to an old QEMU that doesn't
have this heuristic could fail.

Ways to mitigate: force all devices with subsections to bump their
version number.  Wave our hands around and claim that the new version
requires the subsection heuristics to be present.

2) Add Paolo's protocol change.  This will cause a migration flag
day.  Since we want to switch to ASN.1 too, we'll have another flag
day for the next release too.

3) Change subsection protocol more dramatically than Paolo's change
(make subsections stand alone sections).  Not clear how much effort
this is.

4) Avoid subsections until we introduce a new wire protocol based on
ASN.1 that can better handle concepts like subsections.  This misses
some opportunity for backwards compatibility in the short term but
avoids repeated flag days.



5) Implement subsections through the wire as top-level sections (as
originally intended).  Keep existing subsections with (1).


That was (3).


btw, it's reasonable to require that backwards migration is only to a
fully updated stable release, so we can do 5) too, or backport 1).


But given the choice of a nasty silent failure to a not-quite-up-to-date stable 
release or failing migration to a fully up-to-date stable release, I think it's 
better that we err on the side of caution.


Not being able to migrate because of a recoverable failure is annoying.  Having 
a silent failure that possibly results in corruption is unacceptable.


Regards,

Anthony Liguori





Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Avi Kivity
On 10/11/2011 03:01 PM, Anthony Liguori wrote:
> On 10/11/2011 06:48 AM, Avi Kivity wrote:
>> On 10/10/2011 01:35 PM, Juan Quintela wrote:
>>> Hi
>>>
>>> Please send in any agenda items you are interested in covering.
>>>
>>
>> Subsections, version numbers, migration to older releases.
>
> Problem with subsections:
>
> The encoding of a subsection within an embedded structure is ambiguous
> because the subsection occurs at the end of the structure.  QEMU may
> mistakenly parse what follows the structure as the end-of-subsection
> delimiter.
>
> Possible solutions:
>
> 1) Juan has a series that adds heuristics to better match the EOS
> delimiter.  While not 100% perfect, it should handle practically all
> possible cases.
>
> The main issue is that it's not present in older QEMUs which means
> migrating a subsection within a structure to an old QEMU that doesn't
> have this heuristic could fail.
>
> Ways to mitigate: force all devices with subsections to bump their
> version number.  Wave our hands around and claim that the new version
> requires the subsection heuristics to be present.
>
> 2) Add Paolo's protocol change.  This will cause a migration flag
> day.  Since we want to switch to ASN.1 too, we'll have another flag
> day for the next release too.
>
> 3) Change subsection protocol more dramatically than Paolo's change
> (make subsections stand alone sections).  Not clear how much effort
> this is.
>
> 4) Avoid subsections until we introduce a new wire protocol based on
> ASN.1 that can better handle concepts like subsections.  This misses
> some opportunity for backwards compatibility in the short term but
> avoids repeated flag days.
>

5) Implement subsections through the wire as top-level sections (as
originally intended).  Keep existing subsections with (1).

btw, it's reasonable to require that backwards migration is only to a
fully updated stable release, so we can do 5) too, or backport 1).

-- 
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Anthony Liguori

On 10/11/2011 06:48 AM, Avi Kivity wrote:

On 10/10/2011 01:35 PM, Juan Quintela wrote:

Hi

Please send in any agenda items you are interested in covering.



Subsections, version numbers, migration to older releases.


Problem with subsections:

The encoding of a subsection within an embedded structure is ambiguous because 
the subsection occurs at the end of the structure.  QEMU may mistakenly parse 
what follows the structure as the end-of-subsection delimiter.
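
A minimal sketch of the ambiguity, assuming a toy wire layout in which a subsection is announced by a single marker byte followed by a length-prefixed name.  The 0x05 marker value and both function names are assumptions made for illustration, and the heuristic shown is only one plausible kind of check, not a description of Juan's series.

```python
QEMU_VM_SUBSECTION = 0x05  # assumed marker byte that introduces a subsection


def naive_is_subsection(stream, pos):
    """Decide by peeking at a single byte, which ordinary data can mimic."""
    return pos < len(stream) and stream[pos] == QEMU_VM_SUBSECTION


def heuristic_is_subsection(stream, pos, parent):
    """Additionally require a length-prefixed name of the form 'parent/...'."""
    if not naive_is_subsection(stream, pos) or pos + 1 >= len(stream):
        return False
    length = stream[pos + 1]
    name = stream[pos + 2:pos + 2 + length]
    return len(name) == length and name.startswith(parent.encode("ascii") + b"/")


# Two bytes of embedded struct, then the enclosing struct's next field,
# which happens to begin with the byte 0x05.
plain_data = bytes([0x01, 0x02, 0x05, 0x03]) + b"abc"
# The same embedded struct followed by a real "ide/pio" subsection header.
real_subsection = bytes([0x01, 0x02, 0x05, 0x07]) + b"ide/pio" + bytes(4)
```

With these inputs the one-byte peek reports a subsection at offset 2 in both streams, while the name check accepts only the second.  Data that happens to look like a valid name would still fool it, which is why the heuristic is not 100% perfect.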


Possible solutions:

1) Juan has a series that adds heuristics to better match the EOS delimiter. 
While not 100% perfect, it should handle practically all possible cases.


The main issue is that it's not present in older QEMUs which means migrating a 
subsection within a structure to an old QEMU that doesn't have this heuristic 
could fail.


Ways to mitigate: force all devices with subsections to bump their version 
number.  Wave our hands around and claim that the new version requires the 
subsection heuristics to be present.


2) Add Paolo's protocol change.  This will cause a migration flag day.  Since we 
want to switch to ASN.1 too, we'll have another flag day for the next release too.


3) Change subsection protocol more dramatically than Paolo's change (make 
subsections stand alone sections).  Not clear how much effort this is.


4) Avoid subsections until we introduce a new wire protocol based on ASN.1 that 
can better handle concepts like subsections.  This misses some opportunity for 
backwards compatibility in the short term but avoids repeated flag days.


Regards,

Anthony Liguori








Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Anthony Liguori

On 10/11/2011 06:36 AM, Paolo Bonzini wrote:

On 10/10/2011 01:35 PM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.


Planning the feature freeze:

- what is left to merge?

- test day?


Great topic.  Just a reminder, we're looking at release dates of:

2011-10-15   Soft freeze
2011-11-01   Hard master
2011-11-07   Tag qemu-1.0-rc1
2011-11-14   Tag qemu-1.0-rc2
2011-11-21   Tag qemu-1.0-rc3
2011-11-28   Tag qemu-1.0-rc4
2011-12-01   Tag qemu-1.0

Soft Freeze FAQ:

== What is the soft feature freeze? ==

The soft feature freeze is the beginning of the stabilization phase of QEMU's 
development process.  By the date of the soft feature freeze, any major feature 
should have some code posted to the qemu-devel mailing list if it's targeting a 
given release.


== What should I do by the soft feature freeze? ==

For any major feature that you're targeting to the next release, you should:

# Make sure that you've posted a patch series to qemu-devel
# Write a Feature page on the qemu.org wiki describing the feature and the 
motivation

# On the release planning wiki page, link to your feature wiki page.

== Will my patches be rejected if I don't post before the soft feature freeze? ==

That's ultimately up to the subsystem maintainer.  It's a value call based on 
the relative importance of the feature versus the disruptiveness of the feature. 
It's always best to avoid this problem in the first place and release early, 
release often (http://en.wikipedia.org/wiki/Release_early,_release_often).


Regards,

Anthony Liguori



Paolo






Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Avi Kivity

On 10/10/2011 01:35 PM, Juan Quintela wrote:

Hi

Please send in any agenda items you are interested in covering.



Subsections, version numbers, migration to older releases.

--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] KVM call agenda for October 11th

2011-10-11 Thread Paolo Bonzini

On 10/10/2011 01:35 PM, Juan Quintela wrote:


Hi

Please send in any agenda items you are interested in covering.


Planning the feature freeze:

- what is left to merge?

- test day?

Paolo



[Qemu-devel] KVM call agenda for October 11th

2011-10-10 Thread Juan Quintela

Hi

Please send in any agenda items you are interested in covering.

Thanks, Juan.

