Hi Adam,
First of all, thanks to all of you for your time and help.
I'll try to explain once again for those who don't know the complete story.
I had a crash on Wednesday, March 11. I don't yet know the cause. What I do 
know is that, after what may have been an electrical overload, the oVirt 
manager, which for the moment is a VM rather than a physical server, went down, 
although its host did not.
That would not have been so critical if we had not lost access to all the VMs 
in the Data Center. My two hypervisor hosts were still up, so I suspected that 
something was broken in the vdsm link.
I rebooted the manager, but the whole Data Center was unresponsive and there 
was no way to bring the DC back. I also rebooted the hosts and the SAN bay, 
with no change.

I had a backup made with my new AcronisBackupAdvanced solution, but 
unfortunately I also hit an issue (unknown to Acronis) when restoring the VM. 
The Acronis team worked on it from March 12 without finding a solution, until I 
found a workaround last Friday, March 27, and managed to restore the Manager as 
it was on March 10.

During these long days of waiting, I tried several things to recover my Data 
Center; maybe I did something wrong...
I first tried to update oVirt from 3.5.0 to 3.5.1.1, without success, so I 
decided to build a new manager from scratch using a clone of my oVirt Manager, 
to see whether I could use the "Import Domain" option. I tried it on two 
Storage Domain volumes: VOL-UNC-NAS-01 and VOL-UNC-PROD-02.
The import worked, but no VMs were visible afterwards.
I was already in touch with Maor Lipchuk for help, but I couldn't retrieve any 
VMs.

Now we come to Friday the 27th: I found the workaround to restore my Acronis 
backup and, after some work and a reinstallation of the hosts, I finally got my 
Data Center back, with the Storage Domains UP (all of them at that time) and 
all the VMs.
When I tried to start the VMs, only 4 of them came UP, in fact only those 
stored on the volume VOL-UNC-PROD-01, on which I had not tried the "Import 
Domain" option. I deduced that the problem came from the two SDs VOL-UNC-NAS-01 
and VOL-UNC-PROD-02.
I decided to put them into maintenance in order to activate or detach them, but 
since then the two volumes have stayed in Maintenance mode, with no way to 
change their state.

This is where we are now; I hope you'll be able to find a solution.
The mystery remains, though: why did the Data Center go down? I can understand 
that the Manager had a problem, but why did all the VMs go down at the same 
time, with no way to recover them?
For now, the most important thing is to recover my Data Center, but it will 
also be very important to find the cause of the disaster, since it could 
compromise my Private Cloud project in my company.

I hope I have been clear and thorough enough for all of you, but don't hesitate 
to contact me if you have any questions.

Thanks a lot again for your help.






Alain VONDRA
Information Systems Operations Officer
Administrative and Financial Department
+33 1 44 39 77 76
UNICEF France
3 rue Duguay Trouin  75006 PARIS
www.unicef.fr




-----Original Message-----
From: Adam Litke [mailto:ali...@redhat.com]
Sent: Wednesday, April 1, 2015 17:06
To: VONDRA Alain
Cc: Elad Ben Aharon; users@ovirt.org; Federico Simoncelli; Maor Lipchuk
Subject: Re: [ovirt-users] Storage domain not in pool issue

On 31/03/15 08:43 +0000, VONDRA Alain wrote:
>Hi,
>Here are the logs.
>Thanks

Federico, Maor: tldr; Can you offer some advice for recovering this block SD 
after a DC disaster?

Hi Alain,

After looking at your logs, it's clear that the metadata on the storage domain 
itself says that the domain is attached to pool
c58a44b1-1c98-450e-97e1-3347eeb28f86 while engine thinks the domain is attached 
to pool f422de63-8869-41ef-a782-8b0c9ee03c41.
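
[Editor's sketch] The mismatch described above amounts to comparing the pool 
UUID recorded in the storage domain's own metadata with the pool UUID the 
engine expects. The following is only a minimal illustration of that 
comparison, assuming the metadata has already been dumped to KEY=VALUE text. 
The key name POOL_UUID, the helper names and the sample dump are assumptions 
made for illustration; they are not taken from the logs, and this is not a 
vdsm or engine API. It only reads text and changes nothing on the storage.

```python
# Hypothetical helper: compare the pool recorded on storage with the
# pool the engine expects. Key names are assumptions, not a vdsm API.

def parse_metadata(text):
    """Parse KEY=VALUE lines into a dict, skipping blank or malformed lines."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        meta[key.strip()] = value.strip()
    return meta


def pool_mismatch(metadata_text, engine_pool_uuid, key="POOL_UUID"):
    """Return (pool_on_storage, matches_engine) for a metadata dump."""
    on_storage = parse_metadata(metadata_text).get(key, "")
    return on_storage, on_storage.lower() == engine_pool_uuid.lower()


# Illustrative dump built from the two pool UUIDs quoted above.
SAMPLE = """
POOL_UUID=c58a44b1-1c98-450e-97e1-3347eeb28f86
ROLE=Regular
"""
ENGINE_POOL = "f422de63-8869-41ef-a782-8b0c9ee03c41"

on_storage, matches = pool_mismatch(SAMPLE, ENGINE_POOL)
print("pool recorded on storage:", on_storage)
print("engine expects the same pool:", matches)
```

With the two UUIDs from this thread the comparison reports a mismatch, which 
is the condition that makes the attach fail.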

Can you please explain the process you used to recover from your datacenter 
disaster?  My guess is you:
  1. Reinstalled the engine host with a blank oVirt DB
  2. Created a new data center
  3. Created a new master domain
  4. Attached some storage domains which were not attached at the time
     of your previous disaster
  5. Tried to attach sd:d7b9d7cc-f7d6-43c7-ae13-e720951657c9 which was
     attached to your old storage pool at the time of the disaster.

#5 failed because the metadata on the storage shows the old storage pool.  At 
this point I see two possible options to recover your storage.  PLEASE DO NOT 
DO ANYTHING YET (until we confirm what the best approach for recovery will be).

Option 1: Use the new import storage domain feature to import this domain into 
your new datacenter.

Option 2: Modify the storage domain metadata to remove the reference to the old 
storage pool.

I am adding some other oVirt storage experts to the thread in order to offer 
you the best advice.  Federico, Maor: can you offer some expert advice on this 
matter?

I did notice this wiki page which talks about clearing the storage pool 
metadata from an export domain.  Since this SD is iSCSI, it will be a bit more 
difficult to manually edit the md but I'd guess someone has a script or some 
instructions on how to do it.

--
Adam Litke
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users