On Wed Feb 25, 2026 at 4:18 PM CET, Fiona Ebner wrote:
> diff --git a/src/PVE/API2/Qemu.pm b/src/PVE/API2/Qemu.pm
> index 1f0864f5..47466513 100644
> --- a/src/PVE/API2/Qemu.pm
> +++ b/src/PVE/API2/Qemu.pm
> @@ -5399,7 +5399,9 @@ __PACKAGE__->register_method({
>              force => {
>                  type => 'boolean',
>                  description =>
> -                    "Allow to migrate VMs which use local devices. Only root may use this option.",
> +                    "Allow to migrate VMs which use local devices and for intra-cluster migration,"
> +                    . " configuration options not understood by the target. Only root may use this"
> +                    . " option.",

HA-managed VMs are always migrated with force set, as it was assumed at
the time to be only used for local devices [0]. This might need some
adaptation so that LRM-initiated migrations won't cause problems for the
VMs that this patch series wants to fix.
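To illustrate what I mean (a rough sketch only, the sub and parameter
names are made up and not the actual pve-ha-manager code): the LRM side
would only pass force when the original local-device semantics are
actually wanted, instead of unconditionally:

```perl
use strict;
use warnings;

# Hypothetical helper: build API parameters for an LRM-initiated
# migration. 'force' is only set for the local-device case the flag was
# originally meant for, so the new strict config check still applies to
# all other HA-managed migrations.
sub migrate_params {
    my ($has_local_devices) = @_;
    my $params = {};
    $params->{force} = 1 if $has_local_devices;
    return $params;
}
```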

[0] 
https://git.proxmox.com/?p=pve-ha-manager.git;a=blob;f=src/PVE/HA/Resources/PVEVM.pm;h=7586da84b7f19686b680d4e1434a17ffe1633d6d;hb=1a8d8bcef1934a43d37344caf965c082e55d451c#l116

As we might want to quickly know which guests can be moved to which
nodes in the future, e.g. for the load balancer to know which target
nodes to consider, I briefly considered whether some config versioning
could also make sense, negotiated between the source and target node
(e.g. qemu-server on the source node is lower than on the target node,
so the VM can be migrated). But that might be too strict, especially for
guests that don't even use the new config properties of the more recent
qemu-server version.

But maybe these load-balancing decisions can also be more coarse-grained
than this fine-grained check for config compatibility, and implemented
at a later time when they are actually needed.
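As a sketch of the negotiation idea (all names and the versioning scheme
here are hypothetical): rather than comparing package versions directly,
the check could compare the minimal schema version the guest's config
actually requires against what the target advertises, so guests not
using any new properties stay migratable:

```perl
use strict;
use warnings;

# Hypothetical compatibility check: $required is the lowest config
# schema version able to express this guest's config (stays low for
# guests not using any new properties), $target is the schema version
# the target node advertises during negotiation.
sub config_compatible {
    my ($required, $target) = @_;
    return $required <= $target;
}
```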

What do you think?

>                  optional => 1,
>              },
>              migration_type => {
> diff --git a/src/PVE/QemuMigrate.pm b/src/PVE/QemuMigrate.pm
> index f7ec3227..901fe96d 100644
> --- a/src/PVE/QemuMigrate.pm
> +++ b/src/PVE/QemuMigrate.pm
> @@ -355,6 +355,33 @@ sub prepare {
>          my $cmd = [@{ $self->{rem_ssh} }, '/bin/true'];
>          eval { $self->cmd_quiet($cmd); };
>          die "Can't connect to destination address using public key\n" if $@;
> +
> +        if (!$self->{opts}->{force}) {
> +            # Fork a short-lived tunnel for checking the config. Later, the proper tunnel with SSH
> +            # forwarding info is forked.
> +            my $tunnel = $self->fork_tunnel();
> +            # Compared to remote migration, which also does volume activation, this only strictly
> +            # parses the config, so no large timeout is needed. Unfortunately, mtunnel did not
> +            # indicate that a command is unknown, but not reply at all, so the timeout must be very
> +            # low right now.
> +            # FIXME PVE 10 - bump timeout, the trade-off between delaying backwards migration and
> +            # giving config check more time should now be in favor of config checking
> +            eval {
> +                my $nodename = PVE::INotify::nodename();
> +                PVE::Tunnel::write_tunnel($tunnel, 3, "config $vmid $nodename");
> +            };
> +            if (my $err = $@) {
> +                chomp($err);
> +                # if there is no reply, assume target did not know the command yet
> +                if ($err =~ m/^no reply to command/) {
> +                    $self->log('info', "skipping strict configuration check (target too old?)");
> +                } else {
> +                    die "$err - use --force to migrate regardless\n";

Though unlikely (I couldn't run `systemctl stop sshd` on the target node
in time with a few tries ^^), write_tunnel(...) might fail with an $err
that doesn't really explain why the migration failed. It might be better
to filter here, or to explicitly prepend that the strict config check
failed and then append the full error message?
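Something along these lines, perhaps (just a sketch of the suggestion,
not a proposed final diff; the helper is made up):

```perl
use strict;
use warnings;

# Sketch: wrap unrelated tunnel errors so the log still points at the
# config check. Returns undef when the target simply doesn't know the
# command yet, and the full message to die with otherwise.
sub config_check_error {
    my ($err) = @_;
    chomp($err);
    return undef if $err =~ m/^no reply to command/;
    return "strict configuration check failed: $err"
        . " - use --force to migrate regardless\n";
}
```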

> +                }
> +            }
> +            eval { PVE::Tunnel::finish_tunnel($tunnel); };
> +            $self->log('warn', "failed to finish tunnel in prepare() - $@") if $@;
> +        }
>      }
>  
>      return $running;
> diff --git a/src/test/MigrationTest/QemuMigrateMock.pm b/src/test/MigrationTest/QemuMigrateMock.pm
> index df8b575a..170634de 100644
> --- a/src/test/MigrationTest/QemuMigrateMock.pm
> +++ b/src/test/MigrationTest/QemuMigrateMock.pm
> @@ -65,6 +65,10 @@ $tunnel_module->mock(
>              my $vmid = $1;
>              die "resuming wrong VM '$vmid'\n" if $vmid ne $test_vmid;
>              return;
> +        } elsif ($command =~ m/^config (\d+) (\S+)$/) {
> +            my ($vmid, $node) = ($1, $2);
> +            die "check config for wrong VM '$vmid'\n" if $vmid ne $test_vmid;
> +            return;
>          }
>          die "write_tunnel (mocked) - implement me: $command\n";
>      },
> @@ -73,7 +77,12 @@ $tunnel_module->mock(
>  my $qemu_migrate_module = Test::MockModule->new("PVE::QemuMigrate");
>  $qemu_migrate_module->mock(
>      fork_tunnel => sub {
> -        die "fork_tunnel (mocked) - implement me\n"; # currently no call should lead here
> +        return {
> +            writer => "mocked",
> +            reader => "mocked",
> +            pid => 123456,
> +            version => 1,
> +        };
>      },
>      start_remote_tunnel => sub {
>          my ($self, $raddr, $rport, $ruri, $unix_socket_info) = @_;
