Thank you for reporting this issue, because I hit exactly the same one: FC storage domain, and sometimes many of my hosts (15) become unavailable without any apparent action on them. The error message is: storage domain is unavailable. It is a disaster when power management is activated, because the hosts all reboot at the same time and all VMs go down without migrating. It happened to me twice, and the second time it hurt less because I had deactivated power management. It may be a serious issue, because the hosts stay reachable and the LUN is still fine when running an lvs command. The workaround in this case is to restart the engine (restarting vdsm achieves nothing); after that, all the hosts come back up.

 * el6 engine on a separate KVM
 * both el7 and el6 hosts are involved
 * oVirt 3.5.1 and vdsm 4.16.10-8
 * 2 FC datacenters on two remote sites with the same engine, and both
   are impacted


On 23/03/2015 16:54, Jonas Israelsson wrote:
Greetings.

Running oVirt 3.5 with a mix of NFS and FC Storage.

Engine running on a separate KVM VM, and the node installed with a pre-3.5 ovirt-node "ovirt-node-iso-3.5.0.ovirt35.20140912.el6 (Edited)"

I had some problems with my FC storage where the LUNs became unavailable to my oVirt hosts for a while. Everything is now up and running, and those LUNs are again accessible by the hosts. The NFS domains come back online, but the FC domains do not.

Thread-22::DEBUG::2015-03-23 14:53:02,706::lvm::290::Storage.Misc.excCmd::(cmd) /usr/bin/sudo -n /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 29f9b165-3674-4384-a1d4-7aa87d923d56 (cwd None)

Thread-24::DEBUG::2015-03-23 14:53:02,981::lvm::290::Storage.Misc.excCmd::(cmd) FAILED: <err> = ' Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found\n Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56\n'; <rc> = 5

Thread-24::WARNING::2015-03-23 14:53:02,986::lvm::372::Storage.LVM::(_reloadvgs) lvm vgs failed: 5 [] [' Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found', ' Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56']


Running the command above manually does indeed give the same output:

# /sbin/lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 filter = [ '\''r|.*|'\'' ] } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 29f9b165-3674-4384-a1d4-7aa87d923d56

  Volume group "29f9b165-3674-4384-a1d4-7aa87d923d56" not found
  Skipping volume group 29f9b165-3674-4384-a1d4-7aa87d923d56

What puzzles me is that those volume groups do exist:

lvm vgs
  VG                                   #PV #LV #SN Attr   VSize VFree
  22cf06d1-faca-4e17-ac78-d38b7fc300b1   1  13   0 wz--n- 999.62g 986.50g
  29f9b165-3674-4384-a1d4-7aa87d923d56   1   8   0 wz--n-  99.62g 95.50g
  HostVG                                 1   4   0 wz--n-  13.77g 52.00m


  --- Volume group ---
  VG Name               29f9b165-3674-4384-a1d4-7aa87d923d56
  System ID
  Format                lvm2
  Metadata Areas        2
  Metadata Sequence No  20
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                8
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               99.62 GiB
  PE Size               128.00 MiB
  Total PE              797
  Alloc PE / Size       33 / 4.12 GiB
  Free  PE / Size       764 / 95.50 GiB
  VG UUID               aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk

lvm vgs --config ' devices { preferred_names = ["^/dev/mapper/"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 obtain_device_list_from_udev=0 } global { locking_type=1 prioritise_write_locks=1 wait_for_locks=1 use_lvmetad=0 } backup { retain_min = 50 retain_days = 0 } ' --noheadings --units b --nosuffix --separator '|' --ignoreskippedcluster -o uuid,name,attr,size,free,extent_size,extent_count,free_count,tags,vg_mda_size,vg_mda_free,lv_count,pv_count,pv_name 29f9b165-3674-4384-a1d4-7aa87d923d56


aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk|29f9b165-3674-4384-a1d4-7aa87d923d56|wz--n-|106971529216|102542344192|134217728|797|764|MDT_LEASETIMESEC=60,MDT_CLASS=Data,MDT_VERSION=3,MDT_SDUUID=29f9b165-3674-4384-a1d4-7aa87d923d56,MDT_PV0=pv:36001405c94d80be2ed0482c91a1841b8&44&uuid:muHcYl-sobG-3LyY-jjfg-3fGf-1cHO-uDk7da&44&pestart:0&44&pecount:797&44&mapoffset:0,MDT_LEASERETRIES=3,MDT_VGUUID=aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk,MDT_IOOPTIMEOUTSEC=10,MDT_LOCKRENEWALINTERVALSEC=5,MDT_PHYBLKSIZE=512,MDT_LOGBLKSIZE=512,MDT_TYPE=FCP,MDT_LOCKPOLICY=,MDT_DESCRIPTION=Master,RHAT_storage_domain,MDT_POOL_SPM_ID=-1,MDT_POOL_DESCRIPTION=Elementary,MDT_POOL_SPM_LVER=-1,MDT_POOL_UUID=8c3c5df9-e8ff-4313-99c9-385b6c7d896b,MDT_MASTER_VERSION=10,MDT_POOL_DOMAINS=22cf06d1-faca-4e17-ac78-d38b7fc300b1:Active&44&c434ab5a-9d21-42eb-ba1b-dbd716ba3ed1:Active&44&96e62d18-652d-401a-b4b5-b54ecefa331c:Active&44&29f9b165-3674-4384-a1d4-7aa87d923d56:Active&44&1a0d3e5a-d2ad-4829-8ebd-ad3ff5463062:Active,MDT__SH A_CKSUM=7ea9af890755d96563cb7a736f8e3f46ea986f67,MDT_ROLE=Regular|134217728|67103744|8|1|/dev/sda
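One thing worth checking (this is my reading of the two commands above, not a confirmed diagnosis): the failing command that vdsm issues carries filter = [ 'r|.*|' ] in its devices section, which rejects every block device, while the succeeding command has no filter at all. With a reject-all filter, lvm scans no PVs, so any VG lookup fails with rc 5 even though the VG is intact on disk. Normally one would expect accept rules for the domain's multipath devices ahead of the reject-all rule; the sketch below is illustrative only, with the device WWID taken from the MDT_PV0 tag in the metadata above:

```
# Reject-all filter, as in the failing command: lvm sees no PVs at all,
# so "Volume group ... not found" is exactly what it will report
devices { filter = [ "r|.*|" ] }

# What an accept-then-reject filter would look like instead
# (WWID 36001405c94d80be2ed0482c91a1841b8 comes from MDT_PV0 above):
devices { filter = [ "a|^/dev/mapper/36001405c94d80be2ed0482c91a1841b8$|", "r|.*|" ] }
```

In lvm.conf filter semantics, the first matching pattern wins: "a|...|" accepts a device, "r|...|" rejects it, so a lone "r|.*|" hides every device from lvm.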


[root@patty vdsm]# vdsClient -s 0 getStorageDomainsList   (returns only the NFS domains)
c434ab5a-9d21-42eb-ba1b-dbd716ba3ed1
1a0d3e5a-d2ad-4829-8ebd-ad3ff5463062
a8fd9df0-48f2-40a2-88d4-7bf47fef9b07


engine=# select id,storage,storage_name,storage_domain_type from storage_domain_static ;
                  id                  |                storage                 |      storage_name      | storage_domain_type
--------------------------------------+----------------------------------------+------------------------+---------------------
 072fbaa1-08f3-4a40-9f34-a5ca22dd1d74 | ceab03af-7220-4d42-8f5c-9b557f5d29af   | ovirt-image-repository | 4
 1a0d3e5a-d2ad-4829-8ebd-ad3ff5463062 | 6564a0b2-2f92-48de-b986-e92de7e28885   | ISO                    | 2
 c434ab5a-9d21-42eb-ba1b-dbd716ba3ed1 | bb54b2b8-00a2-4b84-a886-d76dd70c3cb0   | Export                 | 3
 22cf06d1-faca-4e17-ac78-d38b7fc300b1 | e43eRZ-HACv-YscJ-KNZh-HVwe-tAd2-0oGNHh | Hinken                 | 1  <---- 'GONE'
 29f9b165-3674-4384-a1d4-7aa87d923d56 | aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk | Master                 | 1  <---- 'GONE'
 a8fd9df0-48f2-40a2-88d4-7bf47fef9b07 | 0299ca61-d68e-4282-b6c3-f6e14aef2688   | NFS-DATA               | 0

When manually trying to activate one of the above domains the following is written to the engine.log

2015-03-23 16:37:27,193 INFO [org.ovirt.engine.core.bll.storage.SyncLunsInfoForBlockStorageDomainCommand] (org.ovirt.thread.pool-8-thread-42) [5f2bcbf9] Running command: SyncLunsInfoForBlockStorageDomainCommand internal: true. Entities affected : ID: 29f9b165-3674-4384-a1d4-7aa87d923d56 Type: Storage
2015-03-23 16:37:27,202 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand] (org.ovirt.thread.pool-8-thread-42) [5f2bcbf9] START, GetVGInfoVDSCommand(HostName = patty.elemementary.se, HostId = 38792a69-76f3-46d8-8620-9d4b9a5ec21f, VGID=aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk), log id: 6e6f6792
2015-03-23 16:37:27,404 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand] (org.ovirt.thread.pool-8-thread-28) [3258de6d] Failed in GetVGInfoVDS method
2015-03-23 16:37:27,404 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand] (org.ovirt.thread.pool-8-thread-28) [3258de6d] Command org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand return value

OneVGReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=506, mMessage=Volume Group does not exist: (u'vg_uuid: aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk',)]]

2015-03-23 16:37:27,406 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand] (org.ovirt.thread.pool-8-thread-28) [3258de6d] HostName = patty.elemementary.se
2015-03-23 16:37:27,407 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand] (org.ovirt.thread.pool-8-thread-28) [3258de6d] Command GetVGInfoVDSCommand(HostName = patty.elemementary.se, HostId = 38792a69-76f3-46d8-8620-9d4b9a5ec21f, VGID=aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetVGInfoVDS, error = Volume Group does not exist: (u'vg_uuid: aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk',), code = 506
2015-03-23 16:37:27,409 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand] (org.ovirt.thread.pool-8-thread-28) [3258de6d] FINISH, GetVGInfoVDSCommand, log id: 2edb7c0d
2015-03-23 16:37:27,410 ERROR [org.ovirt.engine.core.bll.storage.SyncLunsInfoForBlockStorageDomainCommand] (org.ovirt.thread.pool-8-thread-28) [3258de6d] Command org.ovirt.engine.core.bll.storage.SyncLunsInfoForBlockStorageDomainCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to GetVGInfoVDS, error = Volume Group does not exist: (u'vg_uuid: aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk',), code = 506 (Failed with error VolumeGroupDoesNotExist and code 506)
2015-03-23 16:37:27,413 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.ActivateStorageDomainVDSCommand] (org.ovirt.thread.pool-8-thread-28) [3258de6d] START, ActivateStorageDomainVDSCommand( storagePoolId = 8c3c5df9-e8ff-4313-99c9-385b6c7d896b, ignoreFailoverLimit = false, storageDomainId = 29f9b165-3674-4384-a1d4-7aa87d923d56), log id: 795253ee
2015-03-23 16:37:27,482 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand] (org.ovirt.thread.pool-8-thread-42) [5f2bcbf9] Failed in GetVGInfoVDS method
2015-03-23 16:37:27,482 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand] (org.ovirt.thread.pool-8-thread-42) [5f2bcbf9] Command org.ovirt.engine.core.vdsbroker.vdsbroker.GetVGInfoVDSCommand return value OneVGReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=506, mMessage=Volume Group does not exist: (u'vg_uuid: aAoOcw-d9YB-y9gP-Tp4M-S0UE-Aqpx-y6Z2Uk',)]]


Could someone (pretty please with sugar on top) point me in the right direction ?

Brgds Jonas

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
