Re: [ovirt-users] Can't remove snapshot

Greg Padgett Tue, 16 Feb 2016 13:52:42 -0800

On 02/16/2016 08:50 AM, Rik Theys wrote:

Hi,


I'm trying to determine the correct "bad_img" uuid in my case.

The VM has two snapshots:

* The "Active VM" snapshot which has a disk that has an actual size
that's 5GB larger than the virtual size. It has a creation date that
matches the timestamp at which I created the second snapshot. The "disk
snapshot id" for this snapshot ends with cd39.

* A "before jessie upgrade" snapshot that has status "illegal". It has
an actual size that's 2GB larger than the virtual size. The creation
date matches the date the VM was initially created. The disk snapshot id
ends with 6249.

 From the above I conclude that the disk with id that ends with 6249 is
the "bad" img I need to specify.

Similar to what I wrote to Marcelo above in the thread, I'd recommend
running the "VM disk info gathering tool" attached to [1]. It's the best
way to ensure the merge was completed and determine which image is the
"bad" one that is no longer in use by any volume chains.

If indeed the "bad" image (whichever one it is) is no longer in use, then
it's possible the image wasn't successfully removed from storage. There
are 2 ways to fix this:


  a) Run the db fixup script to remove the records for the merged image,
     and run the vdsm command by hand to remove it from storage.
  b) Adjust the db records so a merge retry would start at the right
     place, and re-run live merge.

Given that your merge retries were failing, option a) seems most likely to
succeed. The db fixup script is attached to [1]; as parameters you would
need to provide the vm name, snapshot name, and the id of the unused image
as verified by the disk info tool.

To remove the stale LV, the vdsm deleteVolume verb would then be run from
`vdsClient` -- but note that this must be run _on the SPM host_. It will
not only perform lvremove, but also do housekeeping on other storage
metadata to keep everything consistent. For this verb I believe you'll
need to supply not only the unused image id, but also the pool, domain,
and image group ids from your database queries.
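
As a rough sketch of the manual cleanup step described above, the snippet
below only assembles and prints a candidate command line; it does not run
anything. The `-s 0` connection arguments and the argument order (storage
domain, storage pool, image group, volume) are assumptions based on the
usual vdsClient usage text and should be checked against vdsClient's own
help output on the SPM host. The helper name and the placeholder UUIDs are
hypothetical, and any optional trailing arguments are left out.

```python
import uuid


def build_delete_volume_cmd(sd_uuid, sp_uuid, img_group_uuid, vol_uuid):
    """Assemble (but do not run) a vdsClient deleteVolume command line.

    ASSUMPTION: argument order is domain, pool, image group, volume.
    Confirm it with vdsClient's help output before using the result.
    """
    ids = [sd_uuid, sp_uuid, img_group_uuid, vol_uuid]
    for value in ids:
        # uuid.UUID raises ValueError when the string is not a valid UUID,
        # which catches copy/paste mistakes before anything touches storage.
        uuid.UUID(value)
    return ["vdsClient", "-s", "0", "deleteVolume", *ids]


# Placeholder ids only; substitute the values from the database queries
# and the image id verified by the disk info tool.
cmd = build_delete_volume_cmd(
    "00000000-0000-0000-0000-000000000001",  # storage domain id (placeholder)
    "00000000-0000-0000-0000-000000000002",  # storage pool id (placeholder)
    "00000000-0000-0000-0000-000000000003",  # image group id (placeholder)
    "00000000-0000-0000-0000-000000000004",  # unused image id (placeholder)
)
print(" ".join(cmd))
```

Printing the command rather than executing it leaves room to review every
id against the database output before running it by hand on the SPM host.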


I hope that helps.

Greg

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1306741


However, I grepped the output from 'lvs' on the SPM host of the cluster
and both disk id's are returned:

[root@amazone ~]# lvs | egrep 'cd39|6249'
  24d78600-22f4-44f7-987b-fbd866736249 a7ba2db3-517c-408a-8b27-ea45989d6416 -wi-ao----   34.00g
  81458622-aa54-4f2f-b6d8-75e7db36cd39 a7ba2db3-517c-408a-8b27-ea45989d6416 -wi-------    5.00g
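
For reference, the attribute strings in the lvs output above can be decoded
with a small sketch like the one below. It relies on the lv_attr layout
documented in lvs(8), where the fifth character is the state ('a' for
active) and the sixth is the device-open flag ('o' for open). The function
name is hypothetical. An open LV is only a hint about which volume the
running VM holds; it does not replace the check done by the disk info tool.

```python
def lv_state(attr: str) -> dict:
    """Report whether an LV is active and whether its device is open.

    Per lvs(8): character 5 of lv_attr is the state ('a' = active) and
    character 6 is the device-open flag ('o' = open).
    """
    return {"active": attr[4] == "a", "open": attr[5] == "o"}


# Attribute strings copied from the lvs output in this thread.
lvs_rows = {
    "24d78600-22f4-44f7-987b-fbd866736249": "-wi-ao----",
    "81458622-aa54-4f2f-b6d8-75e7db36cd39": "-wi-------",
}

for lv_name, attr in lvs_rows.items():
    print(lv_name, lv_state(attr))
```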


I expected the "bad" img would no longer be found?

The SQL script only cleans up the database and not the logical volumes.
Would running the script not keep a stale LV around?

Also, from the lvs output it seems the "bad" disk is bigger than the
"good" one.

Is it possible the snapshot still needs to be merged?? If so, how can I
initiate that?

Regards,

Rik


On 02/16/2016 02:02 PM, Rik Theys wrote:

Hi Greg,


2016-02-09 21:30 GMT-03:00 Greg Padgett <gpadg...@redhat.com>:

On 02/09/2016 06:08 AM, Michal Skrivanek wrote:

On 03 Feb 2016, at 10:37, Rik Theys <rik.th...@esat.kuleuven.be> wrote:

I can see the snapshot in the "Disk snapshot" tab of the storage. It has
a status of "illegal". Is it OK to (try to) remove this snapshot? Will
this impact the running VM and/or disk image?



No, it’s not OK to remove it while the live merge is (apparently) still
ongoing. I guess that’s a live merge bug?



Indeed, this is bug 1302215.

I wrote a sql script to help with cleanup in this scenario, which you can
find attached to the bug along with a description of how to use it [1].

However, Rik, before trying that, would you be able to run the attached
script [2] (or just the db query within) and forward the output to me? I'd
like to make sure everything looks as it should before modifying the db
directly.


I ran the following query on the engine database:

select images.*
from images
join snapshots on (images.vm_snapshot_id = snapshots.snapshot_id)
join vm_static on (snapshots.vm_id = vm_static.vm_guid)
where vm_static.vm_name = 'lena'
  and snapshots.description = 'before jessie upgrade';

The resulting output is:

image_guid            | 24d78600-22f4-44f7-987b-fbd866736249
creation_date         | 2015-05-19 15:00:13+02
size                  | 34359738368
it_guid               | 00000000-0000-0000-0000-000000000000
parentid              | 00000000-0000-0000-0000-000000000000
imagestatus           | 4
lastmodified          | 2016-01-30 08:45:59.998+01
vm_snapshot_id        | 4b4930ed-b52d-47ec-8506-245b7f144102
volume_type           | 1
volume_format         | 5
image_group_id        | b2390535-744f-4c02-bdc8-5a897226554b
_create_date          | 2015-05-19 15:00:11.864425+02
_update_date          | 2016-01-30 08:45:59.999422+01
active                | f
volume_classification | 1
(1 row)
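
As a quick sanity check on the size value in this row, the sketch below
converts it from bytes to GiB and compares it with the 34.00g that lvs
reports for the same image. It assumes the database column is in bytes and
that the lvs figure is in GiB; the helper name is hypothetical.

```python
GIB = 2**30


def bytes_to_gib(size_bytes: int) -> float:
    """Convert a size in bytes to GiB."""
    return size_bytes / GIB


virtual_size_bytes = 34359738368  # "size" column from the query result
lv_size_gib = 34.0                # LV size reported by lvs for the same image

virtual_gib = bytes_to_gib(virtual_size_bytes)
print(f"virtual size: {virtual_gib} GiB")
print(f"LV is larger by: {lv_size_gib - virtual_gib} GiB")
```

The difference matches the 2GB gap between actual and virtual size seen in
the UI for the "before jessie upgrade" snapshot.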

Regards,

Rik


Thanks,
Greg

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1302215#c13
(Also note that the engine should be stopped before running this.)

[2] Arguments are the ovirt db name, db user, and the name of the vm you
were performing live merge on.

Thanks,
michal



Regards,

Rik

On 02/03/2016 10:26 AM, Rik Theys wrote:


Hi,

I created a snapshot of a running VM prior to an OS upgrade. The OS
upgrade has now been successful and I would like to remove the snapshot.
I've selected the snapshot in the UI and clicked Delete to start the
task.

After a few minutes, the task has failed. When I click delete again on
the same snapshot, the failed message is returned after a few seconds.

From browsing through the engine log (attached) it seems the snapshot
was correctly merged in the first try but something went wrong in the
finalizing phase. On retries, the log indicates the snapshot/disk image
no longer exists and the removal of the snapshot fails for this reason.

Is there any way to clean up this snapshot?

I can see the snapshot in the "Disk snapshot" tab of the storage. It has
a status of "illegal". Is it OK to (try to) remove this snapshot? Will
this impact the running VM and/or disk image?


_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
