Re: [libvirt] [PATCH] qemu: Fix regression in snapshot-revert

2019-09-10 Thread Daniel Henrique Barboza



On 9/10/19 3:59 PM, Eric Blake wrote:

On 9/9/19 5:17 PM, Daniel Henrique Barboza wrote:


On 9/9/19 5:52 PM, Eric Blake wrote:

Commit f10562799 introduced a regression: if reverting to a snapshot
fails early (such as when we refuse to revert to an external
snapshot), we lose track of the domain's current snapshot.

See: https://bugzilla.redhat.com/1738747
Signed-off-by: Eric Blake 
---

I don't understand why f10562799 broke this code like this - tried
looking the changes made in the commit and the "if (snap)" was
there since a long time, no changes were made in the 'snap'
variable as well - but this change didn't break anything else, so:

Reviewed-by: Daniel Henrique Barboza 

I'll update the commit message to give more details:

Before that patch, we maintained the notion of the current snapshot in
two places: vm->current_snapshot, and snap->def->current.  If you have a
domain with two snapshots, s1 and s2 (where s2 is current) and want to
revert to s1, the old code cleared the def->current flag on s2 before
noticing that reverting to s1 was not possible, but at least left
vm->current_snapshot unchanged.  That meant we had a _different_ bug -
the code was inconsistent on whether the domain had a current snapshot
(as long as libvirtd didn't shut down, vm->current_snapshot ruled, so s2
was still claimed as current; but if you restart libvirtd, the XML for
s2 didn't track that it was current, so after restart you'd have no
current snapshot).  After that patch, the code was unconditionally
changing vm->current_snapshot to NULL (but at least was consistent
everywhere else that the XML never got out of sync with the notion of
which snapshot was current).  It also didn't help that after that patch,
the code clearing the snapshot occurs twice in the function - once right
after determining that the early checks have succeeded, the other
unconditionally on all failure paths.

So the fix is as simple as removing the unconditional clearing of s2 as
the current snapshot in the cleanup code, in favor of the earlier
clearing that happens only after the early checks succeed.


Thanks for the explanation!





ps: I think adding a test case for this failure (in tests/virsh-snapshot ?)
is worth considering, especially considering that we changes from
from Maxiwell (adding an inactive XML to the snapshot) that can
add up more complexity in the snapshot mechanics.

I tried. But the test driver doesn't forbid 'snapshot-revert' on
external snapshots, and also doesn't try to write state to XML files, so
it never copied the problematic code from the qemu driver that could
even trigger the bug.


I see. But now that I understood what you did and how the code is
neater, I changed my mind - I don't think Maxiwell's changes are going to
be too hard of a hit there. We can enhance the test_driver to
behave like the regular driver in this case of external snapshots, and
add a test case for this scenario, in another day.


Thanks,


DHB





   src/qemu/qemu_driver.c | 2 --
   1 file changed, 2 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b28a26c3d6..093b15f500 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16941,8 +16941,6 @@
qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
   virDomainSnapshotSetCurrent(vm->snapshots, NULL);
   ret = -1;
   }
-    } else if (snap) {
-    virDomainSnapshotSetCurrent(vm->snapshots, NULL);
   }
   if (ret == 0 && config && vm->persistent &&
   !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,




--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Fix regression in snapshot-revert

2019-09-10 Thread Eric Blake
On 9/9/19 5:17 PM, Daniel Henrique Barboza wrote:
> 
> 
> On 9/9/19 5:52 PM, Eric Blake wrote:
>> Commit f10562799 introduced a regression: if reverting to a snapshot
>> fails early (such as when we refuse to revert to an external
>> snapshot), we lose track of the domain's current snapshot.
>>
>> See: https://bugzilla.redhat.com/1738747
>> Signed-off-by: Eric Blake 
>> ---
> 
> I don't understand why f10562799 broke this code like this - tried
> looking the changes made in the commit and the "if (snap)" was
> there since a long time, no changes were made in the 'snap'
> variable as well - but this change didn't break anything else, so:
> 
> Reviewed-by: Daniel Henrique Barboza 

I'll update the commit message to give more details:

Before that patch, we maintained the notion of the current snapshot in
two places: vm->current_snapshot, and snap->def->current.  If you have a
domain with two snapshots, s1 and s2 (where s2 is current) and want to
revert to s1, the old code cleared the def->current flag on s2 before
noticing that reverting to s1 was not possible, but at least left
vm->current_snapshot unchanged.  That meant we had a _different_ bug -
the code was inconsistent on whether the domain had a current snapshot
(as long as libvirtd didn't shut down, vm->current_snapshot ruled, so s2
was still claimed as current; but if you restart libvirtd, the XML for
s2 didn't track that it was current, so after restart you'd have no
current snapshot).  After that patch, the code was unconditionally
changing vm->current_snapshot to NULL (but at least was consistent
everywhere else that the XML never got out of sync with the notion of
which snapshot was current).  It also didn't help that after that patch,
the code clearing the snapshot occurs twice in the function - once right
after determining that the early checks have succeeded, the other
unconditionally on all failure paths.

So the fix is as simple as removing the unconditional clearing of s2 as
the current snapshot in the cleanup code, in favor of the earlier
clearing that happens only after the early checks succeed.

> 
> 
> ps: I think adding a test case for this failure (in tests/virsh-snapshot ?)
> is worth considering, especially considering that we changes from
> from Maxiwell (adding an inactive XML to the snapshot) that can
> add up more complexity in the snapshot mechanics.

I tried. But the test driver doesn't forbid 'snapshot-revert' on
external snapshots, and also doesn't try to write state to XML files, so
it never copied the problematic code from the qemu driver that could
even trigger the bug.


> 
>>   src/qemu/qemu_driver.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index b28a26c3d6..093b15f500 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -16941,8 +16941,6 @@
>> qemuDomainRevertToSnapshot(virDomainSnapshotPtr snapshot,
>>   virDomainSnapshotSetCurrent(vm->snapshots, NULL);
>>   ret = -1;
>>   }
>> -    } else if (snap) {
>> -    virDomainSnapshotSetCurrent(vm->snapshots, NULL);
>>   }
>>   if (ret == 0 && config && vm->persistent &&
>>   !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Fix regression in snapshot-revert

2019-09-09 Thread Daniel Henrique Barboza




On 9/9/19 5:52 PM, Eric Blake wrote:

Commit f10562799 introduced a regression: if reverting to a snapshot
fails early (such as when we refuse to revert to an external
snapshot), we lose track of the domain's current snapshot.

See: https://bugzilla.redhat.com/1738747
Signed-off-by: Eric Blake 
---


I don't understand why f10562799 broke this code like this - tried
looking the changes made in the commit and the "if (snap)" was
there since a long time, no changes were made in the 'snap'
variable as well - but this change didn't break anything else, so:

Reviewed-by: Daniel Henrique Barboza 


ps: I think adding a test case for this failure (in tests/virsh-snapshot ?)
is worth considering, especially considering that we changes from
from Maxiwell (adding an inactive XML to the snapshot) that can
add up more complexity in the snapshot mechanics.


Thanks,


DHB



  src/qemu/qemu_driver.c | 2 --
  1 file changed, 2 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b28a26c3d6..093b15f500 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16941,8 +16941,6 @@ qemuDomainRevertToSnapshot(virDomainSnapshotPtr 
snapshot,
  virDomainSnapshotSetCurrent(vm->snapshots, NULL);
  ret = -1;
  }
-} else if (snap) {
-virDomainSnapshotSetCurrent(vm->snapshots, NULL);
  }
  if (ret == 0 && config && vm->persistent &&
  !(ret = virDomainSaveConfig(cfg->configDir, driver->caps,


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list