Hi,

This is my first post to the list. I am happy to say that we have been using oVirt for 6 months with a few bumps, but it has mostly been OK.

Until tonight that is...

I had to do maintenance that required rebooting both of our hypervisor nodes. Both of them run Fedora 18 and have been happy for months. After rebooting them tonight, they will not attach to the storage. If it matters, the storage is a server running LIO with a Fibre Channel target.
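
In case it helps to know what I can check, here is a rough sketch of the commands I would use on each node to confirm that the FC LUN and its multipath device actually came back after the reboot (host numbers and device names are just examples, not output I've pasted here):

multipath -ll     # is the FC LUN present as a multipath device?
lsblk             # do the block devices and the LVs show up at all?
# force the FC HBAs to rescan, in case a LUN went missing after the reboot:
for h in /sys/class/scsi_host/host*/scan; do echo "- - -" > "$h"; done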

Vdsm log:

Thread-22::DEBUG::2013-09-19 21:57:09,392::misc::84::Storage.Misc.excCmd::(<lambda>) '/usr/bin/dd iflag=direct if=/dev/b358e46b-635b-4c0e-8e73-0a494602e21d/metadata bs=4096 count=1' (cwd None)
Thread-22::DEBUG::2013-09-19 21:57:09,400::misc::84::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = '1+0 records in\n1+0 records out\n4096 bytes (4.1 kB) copied, 0.000547161 s, 7.5 MB/s\n'; <rc> = 0
Thread-23::DEBUG::2013-09-19 21:57:16,587::lvm::368::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' got the operation mutex
Thread-23::DEBUG::2013-09-19 21:57:16,587::misc::84::Storage.Misc.excCmd::(<lambda>) u'/usr/bin/sudo -n /sbin/lvm vgs --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%360014055193f840cb3743f9befef7aa3%\\", \\"r%.*%\\" ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 } backup { retain_min = 50 retain_days = 0 } " --noheadings --units b --nosuffix --separator | -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50' (cwd None)
Thread-23::DEBUG::2013-09-19 21:57:16,643::misc::84::Storage.Misc.excCmd::(<lambda>) FAILED: <err> = ' Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found\n'; <rc> = 5
Thread-23::WARNING::2013-09-19 21:57:16,649::lvm::373::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50" not found']
Thread-23::DEBUG::2013-09-19 21:57:16,649::lvm::397::OperationMutex::(_reloadvgs) Operation 'lvm reload operation' released the operation mutex
Thread-23::ERROR::2013-09-19 21:57:16,650::domainMonitor::208::Storage.DomainMonitorThread::(_monitorDomain) Error while collecting domain 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50 monitoring information
Traceback (most recent call last):
File "/usr/share/vdsm/storage/domainMonitor.py", line 182, in _monitorDomain
    self.domain = sdCache.produce(self.sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 97, in produce
    domain.getRealDomain()
  File "/usr/share/vdsm/storage/sdc.py", line 52, in getRealDomain
    return self._cache._realProduce(self._sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 121, in _realProduce
    domain = self._findDomain(sdUUID)
  File "/usr/share/vdsm/storage/sdc.py", line 152, in _findDomain
    raise se.StorageDomainDoesNotExist(sdUUID)
StorageDomainDoesNotExist: Storage domain does not exist: (u'6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50',)

vgs output (note that I don't know what the device with UUID Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z is; a sketch of how I plan to track it down follows the output):

[root@node01 vdsm]# vgs
  Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
  VG                                   #PV #LV #SN Attr   VSize   VFree
  b358e46b-635b-4c0e-8e73-0a494602e21d   1  39   0 wz--n-   8.19t  5.88t
  build                                  2   2   0 wz-pn- 299.75g 16.00m
  fedora                                 1   3   0 wz--n- 557.88g     0

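To try to work out which device that missing PV UUID belongs to, I was going to run something like the following (just a sketch; I'm not sure it is the right approach):

pvs -o pv_name,pv_uuid,vg_name     # list PV UUIDs and the VG each one belongs to
vgs -o vg_name,vg_uuid,vg_attr     # a 'p' in the attr column marks a VG with a missing PV
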
lvs output:

[root@node01 vdsm]# lvs
  Couldn't find device with uuid Wy3Ymi-J7bJ-hVxg-sg3L-F5Gv-MQmz-Utwv7z.
  LV                                   VG                                   Attr         LSize Pool Origin Data%  Move Log Copy%  Convert
  0b8cca47-313f-48da-84f2-154810790d5a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  0f6f7572-8797-4d84-831b-87dbc4e1aa48 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  19a1473f-c375-411f-9a02-c6054b9a28d2 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   50.00g
  221144dc-51dc-46ae-9399-c0b8e030f38a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  2386932f-5f68-46e1-99a4-e96c944ac21b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  3e027010-931b-43d6-9c9f-eeeabbdcd47a b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    2.00g
  4257ccc2-94d5-4d71-b21a-c188acbf7ca1 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  200.00g
  4979b2a4-04aa-46a1-be0d-f10be0a1f587 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  4e1b8a1a-1704-422b-9d79-60f15e165cb7 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  70bce792-410f-479f-8e04-a2a4093d3dfb b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  791f6bda-c7eb-4d90-84c1-d7e33e73de62 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  818ad6bc-8da2-4099-b38a-8c5b52f69e32 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  120.00g
  861c9c44-fdeb-43cd-8e5c-32c00ce3cd3d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  86b69521-14db-43d1-801f-9d21f0a0e00f b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  8a578e50-683d-47c3-af41-c7e508d493e8 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  90463a7a-ecd4-4838-bc91-adccf99d9997 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  9170a33d-3bdf-4c15-8e6b-451622c8093b b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   80.00g
  964b9e32-c1ee-4152-a05b-0c43815f5ea6 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  975c0a26-f699-4351-bd27-dd7621eac6bd b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  9ec24f39-8b32-4247-bfb4-4b7f2cf86d9d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  a4f303bf-6e89-43c3-a801-046920cb24d6 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  a7115874-0f1c-4f43-ab3a-a6026ad99013 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  b1fb5597-a3bb-4e4b-b73f-d1752cc576cb b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  bc28d7c6-a14b-4398-8166-ac2f25b17312 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  bc72da88-f5fd-4f53-9c2c-af2fcd14d117 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  c2c1ba71-c938-4d71-876a-1bba89a5d8a9 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  c54eb342-b79b-45fe-8117-aab7137f5f9d b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  c892f8b5-fadc-4774-a355-32655512a462 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  c9f636ce-efed-495d-9a29-cfaac1f289d3 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  cb657c62-44c8-43dd-8ea2-cbf5927cff72 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  cdba05ac-5f68-4213-827b-3d3518c67251 b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  d98f3bfc-55b0-44f7-8a39-cb0920762cba b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  100.00g
  e0708bc4-19df-4d48-a0e7-682070634dea b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----   40.00g
  ids                                  b358e46b-635b-4c0e-8e73-0a494602e21d -wi-ao---  128.00m
  inbox                                b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  128.00m
  leases                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    2.00g
  master                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----    1.00g
  metadata                             b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  512.00m
  outbox                               b358e46b-635b-4c0e-8e73-0a494602e21d -wi-a----  128.00m
  root                                 build                                -wi-----p  298.74g
  swap_1                               build                                -wi-----p 1020.00m
  home                                 fedora                               -wi-ao---  503.88g
  root                                 fedora                               -wi-ao---   50.00g
  swap                                 fedora                               -wi-ao---    4.00g


The strange thing is that I can see all of the LVM volumes for the VMs, and both servers see the storage just fine. This error has me baffled, and it's a total showstopper since the cluster will not come back up.
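
In case it helps, this is what I was planning to check next on the nodes (again just a sketch; the WWID below is the one from the LVM filter in the vdsm log above):

vgs -o vg_name,vg_uuid                                # does a VG named 6cf7e7e9-3ae5-4645-a29c-fb17ecb38a50 exist at all?
ls -l /dev/mapper/360014055193f840cb3743f9befef7aa3   # the only device vdsm's filter accepts
vgs --config 'devices { filter = [ "a|.*|" ] }'       # rescan with a wide-open filter, in case the VG lives on a device the filter rejects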

If anyone can help, it would be very much appreciated.

Thanks,

Dan