Branch: refs/heads/staging
  Home:   https://github.com/qemu/qemu
  Commit: a701c03decf140c7666edee1301e151714e50a72
      
https://github.com/qemu/qemu/commit/a701c03decf140c7666edee1301e151714e50a72
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-20 (Thu, 20 Jun 2024)

  Changed paths:
    M migration/file.c

  Log Message:
  -----------
  migration: Drop reference to QIOChannel if file seeking fails

We forgot to drop the reference to the QIOChannel in the error path of
the offset adjustment. Do it now.

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 6d3279655ac49b806265f08415165f471d33e032
      
https://github.com/qemu/qemu/commit/6d3279655ac49b806265f08415165f471d33e032
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-20 (Thu, 20 Jun 2024)

  Changed paths:
    M migration/file.c

  Log Message:
  -----------
  migration: Fix file migration with fdset

When the "file:" migration support was added we missed the special
case in the qemu_open_old implementation that allows for a particular
file name format to be used to refer to a set of file descriptors that
have been previously provided to QEMU via the add-fd QMP command.

When using this fdset feature, we should not truncate the migration
file because being given an fd means that the management layer is in
control of the file and will likely already have some data written to
it. This is further indicated by the presence of the 'offset'
argument, which indicates the start of the region where QEMU is
allowed to write.

Fix the issue by replacing the O_TRUNC flag on open by an ftruncate
call, which will take the offset into consideration.
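
As a rough illustration of the approach described above (not the actual
QEMU change), the sketch below opens a file without O_TRUNC and then
calls ftruncate() at the offset, so that bytes before the offset survive.
The helper name open_keep_header() is hypothetical, and it takes a plain
path rather than an fdset reference.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not QEMU code: open 'path' for writing without
 * O_TRUNC, then drop only the data located at or beyond 'offset'.
 * Returns the fd on success, -1 on error (errno set by the failing call).
 */
static int open_keep_header(const char *path, off_t offset)
{
    assert(offset >= 0);

    int fd = open(path, O_WRONLY | O_CREAT, 0660);
    if (fd < 0) {
        return -1;
    }
    /* Bytes [0, offset) are preserved; stale data after them is discarded. */
    if (ftruncate(fd, offset) < 0) {
        close(fd);
        return -1;
    }
    /* Position the fd so the first write lands right after the header. */
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    return fd;
}
```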

Fixes: 385f510df5 ("migration: file URI offset")
Suggested-by: Daniel P. Berrangé <berra...@redhat.com>
Reviewed-by: Prasad Pandit <p...@fedoraproject.org>
Reviewed-by: Peter Xu <pet...@redhat.com>
Reviewed-by: Daniel P. Berrangé <berra...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 55fc0c2f68ec81cc51f39964cf6d1bf3f7467a4f
      
https://github.com/qemu/qemu/commit/55fc0c2f68ec81cc51f39964cf6d1bf3f7467a4f
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-20 (Thu, 20 Jun 2024)

  Changed paths:
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  tests/qtest/migration: Fix file migration offset check

When doing file migration, QEMU accepts an offset that should be
skipped when writing the migration stream to the file. The purpose of
the offset is to allow the management layer to put its own metadata at
the start of the file.

We have tests for this in migration-test, but only testing that the
migration stream starts at the correct offset and not that it actually
leaves the data intact. Unsurprisingly, there's been a bug in that
area that the tests didn't catch.

Fix the tests to write some data to the offset region and check that
it's actually there after the migration.

While here, switch to using g_get_file_contents() which is more
portable than mmap().

Reviewed-by: Peter Xu <pet...@redhat.com>
Reviewed-by: Daniel P. Berrangé <berra...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 926554c0bfdfbf7b058ed370c2b484e56b126d34
      
https://github.com/qemu/qemu/commit/926554c0bfdfbf7b058ed370c2b484e56b126d34
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-20 (Thu, 20 Jun 2024)

  Changed paths:
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  tests/qtest/migration: Add a precopy file test with fdset

Add a test for file migration using fdset. The passing of fds is more
complex than using a file path. This is also the scenario where it's
most important we ensure that the initial migration stream offset is
respected because the fdset interface is the one used by the
management layer when providing a non-empty migration file.

Note that fd passing is not available on Windows, so anything that
uses add-fd needs to exclude that platform.

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 1cd93fb0bf8b1fddab4c38e17145cc8776eadaa0
      
https://github.com/qemu/qemu/commit/1cd93fb0bf8b1fddab4c38e17145cc8776eadaa0
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-20 (Thu, 20 Jun 2024)

  Changed paths:
    M include/monitor/monitor.h
    M monitor/fds.c
    M stubs/fdset.c
    M util/osdep.c

  Log Message:
  -----------
  monitor: Drop monitor_fdset_dup_fd_find/_remove()

Those functions are not needed; a single remove function should already
work.  Clean it up.

The code here doesn't really care whether we need to keep that dup_fd
around if close() fails: when that happens something has gone very wrong,
and keeping the dup_fd in the fdsets is unlikely to help.

Cc: Dr. David Alan Gilbert <d...@treblig.org>
Cc: Markus Armbruster <arm...@redhat.com>
Cc: Philippe Mathieu-Daudé <phi...@linaro.org>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Cc: Daniel P. Berrangé <berra...@redhat.com>
Signed-off-by: Peter Xu <pet...@redhat.com>
Reviewed-by: Daniel P. Berrangé <berra...@redhat.com>
[add missing return statement, removal during traversal is not safe]
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: a93ad56053e54a94875faabb042d7c60fdd2fe20
      
https://github.com/qemu/qemu/commit/a93ad56053e54a94875faabb042d7c60fdd2fe20
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-20 (Thu, 20 Jun 2024)

  Changed paths:
    M monitor/fds.c

  Log Message:
  -----------
  monitor: Introduce monitor_fdset_*free

Introduce new functions to remove and free no longer used fds and
fdsets.

We need those to decouple the remove/free routines from
monitor_fdset_cleanup() which will go away in the next patches.

The new functions:

- monitor_fdset_free/_if_empty() will be used when a monitor
  connection closes and when an fd is removed to cleanup any fdset
  that is now empty.

- monitor_fdset_fd_free() will be used to remove one or more fds that
  have been explicitly targeted by qmp_remove_fd().

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 87d67fadb9db5e87072c244df794c0755150fd2a
      
https://github.com/qemu/qemu/commit/87d67fadb9db5e87072c244df794c0755150fd2a
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M monitor/fds.c
    M monitor/hmp.c
    M monitor/monitor-internal.h
    M monitor/monitor.c
    M monitor/qmp.c
    M tests/qtest/libqtest.c
    M tests/qtest/libqtest.h
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  monitor: Stop removing non-duplicated fds

monitor_fdsets_cleanup() currently has three responsibilities:

1- Remove the fds that have been marked for removal(->removed=true) by
   qmp_remove_fd(). This is overly complicated, but ok.

2- Remove any file descriptors that have been passed into QEMU and
   never duplicated[1,2]. A file descriptor without duplicates
   indicates that no part of QEMU has made use of it. This is
   problematic because the current implementation does it only if the
   guest is not running and the monitor is closed.

3- Remove/free fdsets that have become empty due to the above
   removals. This is ok.

The scenario described in (2) is starting to show some cracks now that
we're trying to consume fds from the migration code:

- Doing cleanup every time the last monitor connection closes works to
  reap unused fds, but also has the side effect of forcing the
  management layer to pass the file descriptors again in case of a
  disconnect/re-connect, if that happened to be the only monitor
  connection.

  Another side effect is that removing an fd with qmp_remove_fd() is
  effectively delayed until the last monitor connection closes.

  The usage of mon_refcount is also problematic because it's racy.

- Checking runstate_is_running() skips the cleanup unless the VM is
  running and avoids premature cleanup of the fds, but also has the
  side effect of blocking the legitimate removal of an fd via
  qmp_remove_fd() if the VM happens to be in another state.

  This affects qmp_remove_fd() and qmp_query_fdsets() in particular
  because requesting a removal at a bad time (guest stopped) might
  cause an fd to never be removed, or to be removed at a much later
  point in time, causing the query command to continue showing the
  supposedly removed fd/fdset.

Note that file descriptors that *have* been duplicated are owned by
the code that uses them and will be removed after qemu_close() is
called. Therefore we've decided that the best course of action to
avoid the undesired side-effects is to stop managing non-duplicated
file descriptors.

1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect")
2- ebe52b592d ("monitor: Prevent removing fd from set during init")

Reviewed-by: Peter Xu <pet...@redhat.com>
[fix logic mistake: s/fdset_free/fdset_free_if_empty]
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 881172f3f9dfe5764e7cb8983e5a660b93224d0c
      
https://github.com/qemu/qemu/commit/881172f3f9dfe5764e7cb8983e5a660b93224d0c
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M monitor/fds.c

  Log Message:
  -----------
  monitor: Simplify fdset and fd removal

Remove fds right away instead of setting the ->removed flag. We don't
need the extra complexity of having a cleanup function reap the
removed entries at a later time.

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 960f29b347ad34a53580fa822083d51ba7851b7b
      
https://github.com/qemu/qemu/commit/960f29b347ad34a53580fa822083d51ba7851b7b
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M include/monitor/monitor.h
    M monitor/fds.c
    M stubs/fdset.c
    M util/osdep.c

  Log Message:
  -----------
  monitor: Report errors from monitor_fdset_dup_fd_add

I'm keeping the EACCES because callers expect to be able to look at
errno.

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 46cec74c1b71973756d8960a305bae1491ae6259
      
https://github.com/qemu/qemu/commit/46cec74c1b71973756d8960a305bae1491ae6259
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M io/channel-file.c

  Log Message:
  -----------
  io: Stop using qemu_open_old in channel-file

We want to make use of the Error object to report fdset errors from
qemu_open_internal() and passing the error pointer to qemu_open_old()
would require changing all callers. Move the file channel to the new
API instead.

Reviewed-by: Daniel P. Berrangé <berra...@redhat.com>
Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: b43b61d5bee522dadf44b472af71aab7235c13d5
      
https://github.com/qemu/qemu/commit/b43b61d5bee522dadf44b472af71aab7235c13d5
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M include/qemu/osdep.h
    M migration/migration-hmp-cmds.c
    M migration/options.c
    M migration/options.h
    M qapi/migration.json
    M util/osdep.c

  Log Message:
  -----------
  migration: Add direct-io parameter

Add the direct-io migration parameter that tells the migration code to
use O_DIRECT when opening the migration stream file whenever possible.

This is currently only used with the mapped-ram migration that has a
clear window guaranteed to perform aligned writes.

Acked-by: Markus Armbruster <arm...@redhat.com>
Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 9d70239e56fadaa3571b8a7998a323ced52e8e76
      
https://github.com/qemu/qemu/commit/9d70239e56fadaa3571b8a7998a323ced52e8e76
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M migration/file.c
    M migration/file.h
    M migration/migration.c

  Log Message:
  -----------
  migration/multifd: Add direct-io support

When multifd is used along with mapped-ram, we can take advantage of a
filesystem that supports the O_DIRECT flag and perform direct I/O in
the multifd threads. This brings a significant performance improvement
because direct-io writes bypass the page cache which would otherwise
be thrashed by the multifd data which is unlikely to be needed again
in a short period of time.

To be able to use a multifd channel opened with O_DIRECT, we must
ensure that a certain alignment is used. Filesystems usually require a
block-size alignment for direct I/O. The way to achieve this is by
enabling the mapped-ram feature, which already aligns its I/O properly
(see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c).
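
To make the alignment requirement concrete, here is a minimal sketch of
the two things direct I/O typically needs: file offsets rounded up to a
block multiple and buffers allocated at an aligned address. The helper
names are hypothetical and the alignment value is left to the caller;
neither is taken from QEMU.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round 'value' up to the next multiple of 'align' (must be a power of two). */
static uint64_t align_up(uint64_t value, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/*
 * Allocate 'len' bytes at an address that is a multiple of 'align'.
 * Returns NULL on failure; the caller releases the buffer with free().
 */
static void *alloc_aligned(size_t align, size_t len)
{
    void *buf = NULL;

    if (posix_memalign(&buf, align, len) != 0) {
        return NULL;
    }
    return buf;
}
```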

By setting O_DIRECT on the multifd channels, all writes to the same
file descriptor need to be aligned as well, even the ones that come
from outside multifd, such as the QEMUFile I/O from the main migration
code. This makes it impossible to use the same file descriptor for the
QEMUFile and for the multifd channels. The various flags and metadata
written by the main migration code will always be unaligned by virtue
of their small size. To work around this issue, we'll require a second
file descriptor to be used exclusively for direct I/O.

The second file descriptor can be obtained by QEMU by re-opening the
migration file (already possible), or by being provided by the user or
management application (support to be added in future patches).

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 408d295da82103df833f2f0443442d8aed880cb0
      
https://github.com/qemu/qemu/commit/408d295da82103df833f2f0443442d8aed880cb0
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M tests/qtest/migration-helpers.c
    M tests/qtest/migration-helpers.h
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  tests/qtest/migration: Add tests for file migration with direct-io

The tests are only allowed to run in systems that know about the
O_DIRECT flag and in filesystems which support it.

Note: this also brings back migrate_set_parameter_bool() which went
away when we removed the compression tests. I copied it verbatim.

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 99c147e2f53726290bbdde795b6efbb4d9138657
      
https://github.com/qemu/qemu/commit/99c147e2f53726290bbdde795b6efbb4d9138657
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M monitor/fds.c

  Log Message:
  -----------
  monitor: fdset: Match against O_DIRECT

We're about to enable the use of O_DIRECT in the migration code and
due to the alignment restrictions imposed by filesystems we need to
make sure the flag is only used when doing aligned IO.

The migration will do parallel IO to different regions of a file, so
we need to use more than one file descriptor. Those cannot be obtained
by duplicating (dup()) since duplicated file descriptors share the
file status flags, including O_DIRECT. If one migration channel does
unaligned IO while another sets O_DIRECT to do aligned IO, the
filesystem would fail the unaligned operation.
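
The sharing of status flags between duplicated descriptors can be
observed with a small sketch like the one below. It is only an
illustration: the helper names are hypothetical, and callers may prefer
to try it with O_APPEND as a stand-in for O_DIRECT, since not every
filesystem accepts O_DIRECT. Setting a flag through a dup()ed fd shows up
on the original, while an fd obtained from a separate open() of the same
file is unaffected.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Return 1 if 'flag' is set in the file status flags of 'fd', 0 if not,
 * -1 on error. */
static int has_status_flag(int fd, int flag)
{
    assert(fd >= 0);

    int fl = fcntl(fd, F_GETFL);
    if (fl < 0) {
        return -1;
    }
    return (fl & flag) ? 1 : 0;
}

/* Add 'flag' to the file status flags of 'fd'. Returns 0 on success,
 * -1 on error. The change is visible through every dup() of 'fd'. */
static int set_status_flag(int fd, int flag)
{
    assert(fd >= 0);

    int fl = fcntl(fd, F_GETFL);
    if (fl < 0) {
        return -1;
    }
    return fcntl(fd, F_SETFL, fl | flag) < 0 ? -1 : 0;
}
```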

The add-fd QMP command along with the fdset code are specifically
designed to allow the user to pass a set of file descriptors with
different access flags into QEMU to be later fetched by code that
needs to alternate between those flags when doing IO.

Extend the fdset matching to behave the same with the O_DIRECT flag.

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 8d60280e4f1621e13aea4aa593b5bc3e2af34e9d
      
https://github.com/qemu/qemu/commit/8d60280e4f1621e13aea4aa593b5bc3e2af34e9d
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M docs/devel/migration/main.rst
    M docs/devel/migration/mapped-ram.rst

  Log Message:
  -----------
  migration: Add documentation for fdset with multifd + file

With the last few changes to the fdset infrastructure, we now allow
multifd to use an fdset when migrating to a file. This is useful for
the scenario where the management layer wants to have control over the
migration file.

By receiving the file descriptors directly, QEMU can delegate some
high level operating system operations to the management layer (such
as mandatory access control). The management layer might also want to
add its own headers before the migration stream.

Document the "file:/dev/fdset/#" syntax for the multifd migration with
mapped-ram. The requirements for the fdset mechanism are:

- the fdset must contain two fds that are not duplicates between
  themselves;

- if direct-io is to be used, exactly one of the fds must have the
  O_DIRECT flag set;

- the file must be opened with WRONLY on the migration source side;

- the file must be opened with RDONLY on the migration destination
  side.

Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 31a5a3032eb3d62e045e18c80658e5e8f5341cda
      
https://github.com/qemu/qemu/commit/31a5a3032eb3d62e045e18c80658e5e8f5341cda
  Author: Fabiano Rosas <faro...@suse.de>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  tests/qtest/migration: Add a test for mapped-ram with passing of fds

Add a multifd test for mapped-ram with passing of fds into QEMU. This
is how libvirt will consume the feature.

There are a couple of details to the fdset mechanism:

- multifd needs two distinct file descriptors (not duplicated with
  dup()) so it can enable O_DIRECT only on the channels that do
  aligned IO. The dup() system call creates file descriptors that
  share status flags, of which O_DIRECT is one.

- the open() access mode flags used for the fds passed into QEMU need
  to match the flags QEMU uses to open the file. Currently O_WRONLY
  for src and O_RDONLY for dst.

Note that fdset code goes under _WIN32 because fd passing is not
supported on Windows.

Reviewed-by: Peter Xu <pet...@redhat.com>
[brought back the qmp_remove_fd() call at the end of the tests]
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 637280aeb242517ede480aa2d5ba1c29d41eac11
      
https://github.com/qemu/qemu/commit/637280aeb242517ede480aa2d5ba1c29d41eac11
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration/multifd: Avoid the final FLUSH in complete()

We always flush when finishing one round of scanning, and during the
complete() phase we scan one more round to make sure no dirty pages remain.
In that case we shouldn't need an explicit FLUSH at the end of complete(),
because by then all pages should already have been flushed.

Reviewed-by: Fabiano Rosas <faro...@suse.de>
Tested-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 60ce47675d74ddae3f13a32767d097d9fecbda4b
      
https://github.com/qemu/qemu/commit/60ce47675d74ddae3f13a32767d097d9fecbda4b
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M migration/colo.c
    M migration/migration.c
    M migration/multifd.c
    M migration/postcopy-ram.c
    M migration/savevm.c

  Log Message:
  -----------
  migration: Rename thread debug names

The postcopy thread names on dest QEMU are slightly confusing, partly I'll
need to blame myself on 36f62f11e4 ("migration: Postcopy preemption
preparation on channel creation").  E.g., "fault-fast" reads like a fast
version of "fault-default", but it's actually the fast version of
"postcopy/listen".

Taking this chance, rename all the migration threads with proper rules.
Considering we only have 15 chars usable, prefix all threads with "mig/",
meanwhile identify src/dst threads properly this time.  So now most thread
names will look like "mig/DIR/xxx", where DIR will be "src"/"dst", except
the bg-snapshot thread which doesn't have a direction.

For multifd threads, making them "mig/{src|dst}/{send|recv}_%d".
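
A quick sketch of how the multifd pattern interacts with the 15-character
budget mentioned above. The function name and the 16-byte buffer size are
assumptions for illustration, not QEMU's code: with this pattern a
one- or two-digit channel id fits, while a three-digit id would be cut
off.

```c
#include <assert.h>
#include <stdio.h>

/* 15 usable characters plus the terminating NUL (assumed budget). */
#define THREAD_NAME_MAX 16

/*
 * Hypothetical helper: format "mig/{src|dst}/{send|recv}_%d" into 'out'.
 * Returns 1 if the whole name fit, 0 if it had to be truncated.
 */
static int multifd_thread_name(char out[THREAD_NAME_MAX], int is_src, int id)
{
    assert(id >= 0);

    int n = snprintf(out, THREAD_NAME_MAX, "mig/%s/%s_%d",
                     is_src ? "src" : "dst",
                     is_src ? "send" : "recv", id);
    return n >= 0 && n < THREAD_NAME_MAX;
}
```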

We used to have "live_migration" thread for a very long time, now it's
called "mig/src/main".  We may hope to have "mig/dst/main" soon but not
yet.

Reviewed-by: Fabiano Rosas <faro...@suse.de>
Reviewed-by: Zhijian Li (Fujitsu) <lizhij...@fujitsu.com>
Signed-off-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: a5c24e13e9f176901058b460e61425756322f3e8
      
https://github.com/qemu/qemu/commit/a5c24e13e9f176901058b460e61425756322f3e8
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M migration/migration.c
    M migration/migration.h

  Log Message:
  -----------
  migration: Use MigrationStatus instead of int

QEMU uses "int" in most cases even if it stores MigrationStatus.  I don't
know why, so let's try to do that right and see what blows up..

Reviewed-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 4dd5f7b8d568116b3ce594b0055a47c6db50f49c
      
https://github.com/qemu/qemu/commit/4dd5f7b8d568116b3ce594b0055a47c6db50f49c
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M migration/migration.c

  Log Message:
  -----------
  migration: Cleanup incoming migration setup state change

Destination QEMU can setup incoming ports for two purposes: either a fresh
new incoming migration, in which QEMU will switch to SETUP for channel
establishment, or a paused postcopy migration, in which QEMU will stay in
POSTCOPY_PAUSED until kicking off the RECOVER phase.

Currently the state machine works on the dest node for the latter only
because migrate_set_state() implicitly becomes a noop if the current-state
check fails.  That wasn't clear at all.

Clean it up by providing a helper migration_incoming_state_setup() doing
proper checks over current status.  Postcopy-paused will be explicitly
checked now, and then we can bail out for unknown states.

Reviewed-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 4146b77ec7640d3c30d42558e13423594b114385
      
https://github.com/qemu/qemu/commit/4146b77ec7640d3c30d42558e13423594b114385
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M migration/migration.c
    M migration/postcopy-ram.c
    M migration/postcopy-ram.h
    M migration/savevm.c
    M qapi/migration.json

  Log Message:
  -----------
  migration/postcopy: Add postcopy-recover-setup phase

This patch adds a migration state on src called "postcopy-recover-setup".
The new state will describe the intermediate step starting from when the
src QEMU received a postcopy recovery request, until the migration channels
are properly established, but before the recovery process takes place.

The request came from Libvirt, which currently relies on the migration
state events to detect migration state changes.  That works for most of the
migration process, except for postcopy recovery failures at the beginning.

Currently postcopy recovery only has two major states:

  - postcopy-paused: this is the state that both sides of QEMU will be in
    for a long time as long as the migration channel was interrupted.

  - postcopy-recover: this is the state where both sides of QEMU handshake
    with each other, preparing for a continuation of postcopy which used to
    be interrupted.

The issue here is that when the recovery port is invalid, the src QEMU will
take the URI/channels, notice the ports are not valid, and silently stay in
the postcopy-paused state, with no event sent to Libvirt.  In this case,
the only thing Libvirt can do is to poll the migration status at a suitable
interval, which is less than optimal.

Considering that this is the only case where Libvirt won't get a
notification from QEMU on such events, let's add postcopy-recover-setup
state to mimic what we have with the "setup" state of a newly initialized
migration, describing the phase of connection establishment.

With that, postcopy recovery will have two paths to go now, and either path
is guaranteed to generate an event.  Now the events will look like this
during a recovery process on src QEMU:

  - Initially when the recovery is initiated on src, QEMU will go from
    "postcopy-paused" -> "postcopy-recover-setup".  Old QEMUs don't have
    this event.

  - Depending on whether the channel re-establishment succeeds:

    - In the success case, src QEMU will move from "postcopy-recover-setup"
      to "postcopy-recover".  Old QEMUs also have this event.

    - In the failure case, src QEMU will move from "postcopy-recover-setup"
      to "postcopy-paused" again.  Old QEMUs don't have this event.
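
The src-side transitions listed above can be sketched as a tiny state
function. This is only a model of the description, with hypothetical
enum and function names; QEMU's real state is tracked in MigrationStatus
and the real logic lives in the migration code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names standing in for the three states discussed here. */
typedef enum {
    ST_POSTCOPY_PAUSED,
    ST_POSTCOPY_RECOVER_SETUP,
    ST_POSTCOPY_RECOVER,
} SrcRecoverState;

/*
 * Next src-side state. From paused, a recovery request enters the setup
 * phase; from setup, the outcome of channel re-establishment decides.
 * Every transition changes the state, so each one can emit an event.
 */
static SrcRecoverState src_recover_next(SrcRecoverState cur, bool channels_ok)
{
    switch (cur) {
    case ST_POSTCOPY_PAUSED:
        return ST_POSTCOPY_RECOVER_SETUP;
    case ST_POSTCOPY_RECOVER_SETUP:
        return channels_ok ? ST_POSTCOPY_RECOVER : ST_POSTCOPY_PAUSED;
    default:
        return cur;
    }
}

/* In this sketch the setup phase still counts as "paused". */
static bool src_is_paused(SrcRecoverState s)
{
    return s == ST_POSTCOPY_PAUSED || s == ST_POSTCOPY_RECOVER_SETUP;
}
```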

This guarantees that Libvirt will always receive a notification for the
recovery process.

One thing to mention is that the new status is only needed on src QEMU, not
both.  On dest QEMU, the state machine doesn't change.  Hence the events
don't change either.  It's done like so because dest QEMU may not have an
explicit point of setup start.  E.g., it can happen that dest QEMU doesn't
use the migrate-recover command to supply a new URI/channel, but the old
URI/channels can be reused in recovery, in which case the old ports simply
work again after the network routes are fixed up.

Add a new helper postcopy_is_paused() detecting whether postcopy is still
paused, taking RECOVER_SETUP into account too.  When using it on both
src/dst, a slight change is done altogether to always wait for the
semaphore before checking the status, because for both sides a sem_post()
will be required for a recovery.

Cc: Jiri Denemark <jdene...@redhat.com>
Cc: Prasad Pandit <ppan...@redhat.com>
Reviewed-by: Fabiano Rosas <faro...@suse.de>
Buglink: https://issues.redhat.com/browse/RHEL-38485
Signed-off-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 21e89f7ad526f0dddfc722e615bfb0fcdb705c87
      
https://github.com/qemu/qemu/commit/21e89f7ad526f0dddfc722e615bfb0fcdb705c87
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M docs/devel/migration/postcopy.rst

  Log Message:
  -----------
  migration/docs: Update postcopy recover session for SETUP phase

Firstly, the "Paused" state was added in the wrong place before. The state
machine section was describing PostcopyState, rather than MigrationStatus.
Drop the Paused state descriptions.

Then in the postcopy recover session, add more information on the state
machine for MigrationStatus in the lines.  Add the new RECOVER_SETUP phase.

Reviewed-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Peter Xu <pet...@redhat.com>
[fix typo s/reconnects/reconnect]
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 0fd397359540a6622c5f2164e76fc2cefd811f2a
      
https://github.com/qemu/qemu/commit/0fd397359540a6622c5f2164e76fc2cefd811f2a
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  tests/migration-tests: Drop most WIN32 ifdefs for postcopy failure tests

Most of them are not needed; we can stick with one ifdef inside
postcopy_recover_fail() so as to cover the SCM rights tricks only.
The tests won't run on Windows anyway because has_uffd is always false.

Reviewed-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: cd313b66f203381f2f2f984d5155d7942d26725d
      
https://github.com/qemu/qemu/commit/cd313b66f203381f2f2f984d5155d7942d26725d
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M tests/qtest/migration-helpers.c
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  tests/migration-tests: Always enable migration events

Libvirt should always enable it, so it would be nice for qtest to also cover
that for all tests on both sides.  migrate_incoming_qmp() used to enable it
only on dst; now we enable them on both, as we'll start to sanity check
events even on the src QEMU.

We'll need to leave the one in migrate_incoming_qmp(), because
virtio-net-failover test uses that one only, and it relies on the events to
work.

Signed-off-by: Peter Xu <pet...@redhat.com>
Reviewed-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: d444e5673c223241bd2edbc207b02cc1b2114b71
      
https://github.com/qemu/qemu/commit/d444e5673c223241bd2edbc207b02cc1b2114b71
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M tests/qtest/migration-helpers.c
    M tests/qtest/migration-helpers.h

  Log Message:
  -----------
  tests/migration-tests: migration_event_wait()

Introduce a small helper to wait for a migration event, generalized from
the incoming migration path.  Make the helper easier to use by allowing it
to keep waiting until the expected event is received.

Signed-off-by: Peter Xu <pet...@redhat.com>
Reviewed-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 8dbd24d3aa6d67b2d3576da016fb631fd1edfc2c
      
https://github.com/qemu/qemu/commit/8dbd24d3aa6d67b2d3576da016fb631fd1edfc2c
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  tests/migration-tests: Verify postcopy-recover-setup status

Make sure the postcopy-recover-setup status is present in the postcopy
failure unit test.  Note that it only applies to src QEMU, not dest.

Signed-off-by: Peter Xu <pet...@redhat.com>
Reviewed-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 6cf56a87baf8b99c4296a943d220eb8276ca035a
      
https://github.com/qemu/qemu/commit/6cf56a87baf8b99c4296a943d220eb8276ca035a
  Author: Peter Xu <pet...@redhat.com>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M tests/qtest/migration-test.c

  Log Message:
  -----------
  tests/migration-tests: Cover postcopy failure on reconnect

Make sure there will be an event for postcopy recovery, regardless of
whether the reconnect succeeds or when the failure happens.

The newly added case fails early in postcopy recovery, so that it doesn't
even reach the RECOVER stage on src (in real life it would be the same on
dest, but the test case is just slightly more involved due to the dual
socketpair setup).

To do that, rename postcopy_recovery_test_fail so that it reflects which
stage should fail, instead of being a boolean.

Reviewed-by: Fabiano Rosas <faro...@suse.de>
Signed-off-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: 04b09de16d78cf2d163ca65d7c6d161bf2baceb6
      
https://github.com/qemu/qemu/commit/04b09de16d78cf2d163ca65d7c6d161bf2baceb6
  Author: Philippe Mathieu-Daudé <phi...@linaro.org>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M include/migration/vmstate.h

  Log Message:
  -----------
  migration: Remove unused VMSTATE_ARRAY_TEST() macro

The last use of VMSTATE_ARRAY_TEST() was removed in commit 46baa9007f
("migration/i386: Remove old non-softfloat 64bit FP support"), so we
can safely get rid of it.

Signed-off-by: Philippe Mathieu-Daudé <phi...@linaro.org>
Reviewed-by: Li Zhijian <lizhij...@fujitsu.com>
Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>


  Commit: ffeddb979400b1580ad28acbee09b6f971c3912d
      
https://github.com/qemu/qemu/commit/ffeddb979400b1580ad28acbee09b6f971c3912d
  Author: Richard Henderson <richard.hender...@linaro.org>
  Date:   2024-06-21 (Fri, 21 Jun 2024)

  Changed paths:
    M docs/devel/migration/main.rst
    M docs/devel/migration/mapped-ram.rst
    M docs/devel/migration/postcopy.rst
    M include/migration/vmstate.h
    M include/monitor/monitor.h
    M include/qemu/osdep.h
    M io/channel-file.c
    M migration/colo.c
    M migration/file.c
    M migration/file.h
    M migration/migration-hmp-cmds.c
    M migration/migration.c
    M migration/migration.h
    M migration/multifd.c
    M migration/options.c
    M migration/options.h
    M migration/postcopy-ram.c
    M migration/postcopy-ram.h
    M migration/ram.c
    M migration/savevm.c
    M monitor/fds.c
    M monitor/hmp.c
    M monitor/monitor-internal.h
    M monitor/monitor.c
    M monitor/qmp.c
    M qapi/migration.json
    M stubs/fdset.c
    M tests/qtest/libqtest.c
    M tests/qtest/libqtest.h
    M tests/qtest/migration-helpers.c
    M tests/qtest/migration-helpers.h
    M tests/qtest/migration-test.c
    M util/osdep.c

  Log Message:
  -----------
  Merge tag 'migration-20240621-pull-request' of https://gitlab.com/farosas/qemu into staging

Migration pull request

- Fabiano's fix for fdset + file migration truncating the migration
  file

- Fabiano's fdset + direct-io support for mapped-ram

- Peter's various cleanups (multifd sync, thread names, migration
  states, tests)

- Peter's new migration state postcopy-recover-setup

- Philippe's unused vmstate macro cleanup

# -----BEGIN PGP SIGNATURE-----
#
# iQJEBAABCAAuFiEEqhtIsKIjJqWkw2TPx5jcdBvsMZ0FAmZ1vIsQHGZhcm9zYXNA
# c3VzZS5kZQAKCRDHmNx0G+wxnVZTEACdFIsQ/PJw2C9eeLNor5B5MNSEqUjxX0KN
# 6s/uTkJ/dcv+2PI92SzRCZ1dpR5e9AyjTFYbLc9tPRBIROEhlUaoc84iyEy0jCFU
# eJ65/RQbH5QHRpOZwbN5RmGwnapfOWHGTn3bpdrmSQTOAy8R2TPGY4SVYR+gamTn
# bAv1cAsrOOBUfCi8aqvSlmvuliOW0lzJdF4XHa3mAaigLoF14JdwUZdyIMP1mLDp
# /fllbHCKCvJ1vprE9hQmptBR9PzveJZOZamIVt96djJr5+C869+9PMCn3a5vxqNW
# b+/LhOZjac37Ecg5kgbq+cO1E4EXKC3zWOmDTw8kHUwp9oYNi1upwLdpHbAAZaQD
# /JmHKsExx9QuV8mrVyGBXMI92E6RrT54b1Bjcuo63gAP8p9JRRxGT22U3LghNbTm
# 1XcGPR3rswjT1yTgE6qAqAIMR+7X5MrJVWop9ub/lF5DQ1VYIwmlKSNdwDHFDhRq
# 0F1k2+EksNpcZ0BH2+3iFml7qKHLVupLQKTWcLdrlnQnTfSG3+yW7eyA5Mte79Qp
# nJPcHt8qBqUVQ9Uf/4490TM4Lrp+T+m16exIi0tISLaDXSVkFJnlowipSm+tQ7U3
# Sm68JWdWWEsXZVaMqJeBE8nA/hCoQDpo4hVdwftStI+NayXbRX/EgvPqrNAvwh+c
# i4AdHdn6hQ==
# =ZX0p
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 21 Jun 2024 10:46:51 AM PDT
# gpg:                using RSA key AA1B48B0A22326A5A4C364CFC798DC741BEC319D
# gpg:                issuer "faro...@suse.de"
# gpg: Good signature from "Fabiano Rosas <faro...@suse.de>" [unknown]
# gpg:                 aka "Fabiano Almeida Rosas <fabiano.ro...@suse.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: AA1B 48B0 A223 26A5 A4C3  64CF C798 DC74 1BEC 319D

* tag 'migration-20240621-pull-request' of https://gitlab.com/farosas/qemu: (28 commits)
  migration: Remove unused VMSTATE_ARRAY_TEST() macro
  tests/migration-tests: Cover postcopy failure on reconnect
  tests/migration-tests: Verify postcopy-recover-setup status
  tests/migration-tests: migration_event_wait()
  tests/migration-tests: Always enable migration events
  tests/migration-tests: Drop most WIN32 ifdefs for postcopy failure tests
  migration/docs: Update postcopy recover session for SETUP phase
  migration/postcopy: Add postcopy-recover-setup phase
  migration: Cleanup incoming migration setup state change
  migration: Use MigrationStatus instead of int
  migration: Rename thread debug names
  migration/multifd: Avoid the final FLUSH in complete()
  tests/qtest/migration: Add a test for mapped-ram with passing of fds
  migration: Add documentation for fdset with multifd + file
  monitor: fdset: Match against O_DIRECT
  tests/qtest/migration: Add tests for file migration with direct-io
  migration/multifd: Add direct-io support
  migration: Add direct-io parameter
  io: Stop using qemu_open_old in channel-file
  monitor: Report errors from monitor_fdset_dup_fd_add
  ...

Signed-off-by: Richard Henderson <richard.hender...@linaro.org>


Compare: https://github.com/qemu/qemu/compare/02d9c38236cf...ffeddb979400
