On Mon, Mar 20, 2017 at 10:12 AM, Paolo Margara <paolo.marg...@polito.it> wrote:
> Hi Yedidyah,
>
> On 19/03/2017 11:55, Yedidyah Bar David wrote:
> > On Sat, Mar 18, 2017 at 12:25 PM, Paolo Margara <paolo.marg...@polito.it> wrote:
> >> Hi list,
> >>
> >> I'm working on a system running on oVirt 3.6 and the Engine is reporting
> >> the warning "The Hosted Engine Storage Domain doesn't exist. It should
> >> be imported into the setup." repeatedly in the Events tab into the Admin
> >> Portal.
> >>
> >> I've read into the list that Hosted Engine Storage Domain should be
> >> imported automatically into the setup during the upgrade to 3.6
> >> (original setup was on 3.5), but this not happened while the
> >> HostedEngine is correctly visible into the VM tab after the upgrade.
> > Was the upgrade to 3.6 successful and clean?
> The upgrade from 3.5 to 3.6 was successful, as every subsequent minor
> release upgrades. I rechecked the upgrade logs I haven't seen any
> relevant error.
> One addition information: I'm currently running on CentOS 7 and also the
> original setup was on this release version.
> >
> >> The Hosted Engine Storage Domain is on a dedicated gluster volume but
> >> considering that, if I remember correctly, oVirt 3.5 at that time did
> >> not support gluster as a backend for the HostedEngine at that time I had
> >> installed the engine using gluster's NFS server using
> >> 'localhost:/hosted-engine' as a mount point.
> >>
> >> Currently on every nodes I can read into the log of the
> >> ovirt-hosted-engine-ha agent the following lines:
> >>
> >> MainThread::INFO::2017-03-17 14:04:17,773::hosted_engine::462::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 3400)
> >> MainThread::INFO::2017-03-17 14:04:17,774::hosted_engine::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host virtnode-0-1 (id: 2, score: 3400)
> >> MainThread::INFO::2017-03-17 14:04:27,956::hosted_engine::613::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Initializing VDSM
> >> MainThread::INFO::2017-03-17 14:04:28,055::hosted_engine::658::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Connecting the storage
> >> MainThread::INFO::2017-03-17 14:04:28,078::storage_server::218::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> >> MainThread::INFO::2017-03-17 14:04:28,278::storage_server::222::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Connecting storage server
> >> MainThread::INFO::2017-03-17 14:04:28,398::storage_server::230::ovirt_hosted_engine_ha.lib.storage_server.StorageServer::(connect_storage_server) Refreshing the storage domain
> >> MainThread::INFO::2017-03-17 14:04:28,822::hosted_engine::685::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Preparing images
> >> MainThread::INFO::2017-03-17 14:04:28,822::image::126::ovirt_hosted_engine_ha.lib.image.Image::(prepare_images) Preparing images
> >> MainThread::INFO::2017-03-17 14:04:29,308::hosted_engine::688::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_storage_images) Reloading vm.conf from the shared storage domain
> >> MainThread::INFO::2017-03-17 14:04:29,309::config::206::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Trying to get a fresher copy of vm configuration from the OVF_STORE
> >> MainThread::WARNING::2017-03-17 14:04:29,567::ovf_store::104::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Unable to find OVF_STORE
> >> MainThread::ERROR::2017-03-17 14:04:29,691::config::235::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_local_conf_file) Unable to get vm.conf from OVF_STORE, falling back to initial vm.conf
> > This is normal at your current state.
> >
> >> ...and the following lines into the logfile engine.log inside the Hosted
> >> Engine:
> >>
> >> 2017-03-16 07:36:28,087 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock Acquired to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
> >> 2017-03-16 07:36:28,115 WARN [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] CanDoAction of action 'ImportHostedEngineStorageDomain' failed for user SYSTEM. Reasons: VAR__ACTION__ADD,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_STORAGE_DOMAIN_NOT_EXIST
> > That's the thing to debug. Did you check vdsm logs on the hosts, near
> > the time this happens?
> Some moments before I saw the following lines into the vdsm.log of the
> host that execute the hosted engine and that is the SPM, but I see the
> same lines also on the other nodes:
>
> Thread-1746094::DEBUG::2017-03-16 07:36:00,412::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state init -> state preparing
> Thread-1746094::INFO::2017-03-16 07:36:00,413::logUtils::48::dispatcher::(wrapper) Run and protect: getImagesList(sdUUID='3b5db584-5d21-41dc-8f8d-712ce9423a27', options=None)
> Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::199::Storage.ResourceManager.Request::(__init__) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27` ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '3313' at 'getImagesList'
> Thread-1746094::DEBUG::2017-03-16 07:36:00,413::resourceManager::545::Storage.ResourceManager::(registerResource) Trying to register resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' for lock type 'shared'
> Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::604::Storage.ResourceManager::(registerResource) Resource 'Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27' is free. Now locking as 'shared' (1 active user)
> Thread-1746094::DEBUG::2017-03-16 07:36:00,414::resourceManager::239::Storage.ResourceManager.Request::(grant) ResName=`Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27` ReqID=`8ea3c7f3-8ccd-4127-96b1-ec97a3c7b8d4`::Granted request
> Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::827::Storage.TaskManager.Task::(resourceAcquired) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_resourcesAcquired: Storage.3b5db584-5d21-41dc-8f8d-712ce9423a27 (shared)
> Thread-1746094::DEBUG::2017-03-16 07:36:00,414::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting False
> Thread-1746094::ERROR::2017-03-16 07:36:00,415::task::866::Storage.TaskManager.Task::(_setError) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 3315, in getImagesList
>     images = dom.getAllImages()
>   File "/usr/share/vdsm/storage/fileSD.py", line 373, in getAllImages
>     self.getPools()[0],
> IndexError: list index out of range
> Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::885::Storage.TaskManager.Task::(_run) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._run: ae5af1a1-207c-432d-acfa-f3e03e014ee6 ('3b5db584-5d21-41dc-8f8d-712ce9423a27',) {} failed - stopping task
> Thread-1746094::DEBUG::2017-03-16 07:36:00,415::task::1246::Storage.TaskManager.Task::(stop) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::stopping in state preparing (force False)
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 1 aborting True
> Thread-1746094::INFO::2017-03-16 07:36:00,416::task::1171::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::aborting: Task is aborted: u'list index out of range' - code 100
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::1176::Storage.TaskManager.Task::(prepare) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Prepare: aborted: list index out of range
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::993::Storage.TaskManager.Task::(_decref) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::ref 0 aborting True
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::task::928::Storage.TaskManager.Task::(_doAbort) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::Task._doAbort: force False
> Thread-1746094::DEBUG::2017-03-16 07:36:00,416::resourceManager::980::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
> Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state preparing -> state aborting
> Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::550::Storage.TaskManager.Task::(__state_aborting) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::_aborting: recover policy none
> Thread-1746094::DEBUG::2017-03-16 07:36:00,417::task::595::Storage.TaskManager.Task::(_updateState) Task=`ae5af1a1-207c-432d-acfa-f3e03e014ee6`::moving from state aborting -> state failed
>
> After that I tried to execute a simple query on storage domains using
> vdsClient and I got the following information:
>
> # vdsClient -s 0 getStorageDomainsList
> 3b5db584-5d21-41dc-8f8d-712ce9423a27
> 0966f366-b5ae-49e8-b05e-bee1895c2d54
> 35223b83-e0bd-4c8d-91a9-8c6b85336e7d
> 2c3994e3-1f93-4f2a-8a0a-0b5d388a2be7
> # vdsClient -s 0 getStorageDomainInfo 3b5db584-5d21-41dc-8f8d-712ce9423a27
> uuid = 3b5db584-5d21-41dc-8f8d-712ce9423a27
> version = 3
> role = Regular
> remotePath = localhost:/hosted-engine
>

Your issue is probably here: by design, all the hosts of a single datacenter
should be able to see all the storage domains, including the hosted-engine
one, but if you try to mount it as localhost:/hosted-engine this will not be
possible.

> type = NFS
> class = Data
> pool = []
> name = default
> # vdsClient -s 0 getImagesList 3b5db584-5d21-41dc-8f8d-712ce9423a27
> list index out of range
>
> All other storage domains have the pool attribute defined, could be this
> the issue? How can I assign to a pool the Hosted Engine Storage Domain?
>

Assigning the domain to a pool will be the result of the auto-import
process, once that becomes feasible.
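
If it helps while checking the other hosts, here is a rough Python sketch that
flags the two symptoms visible in the output you pasted: an empty pool list
(which would be consistent with the IndexError raised on self.getPools()[0]
in your traceback) and a remotePath that points at a loopback address. It only
parses the "key = value" text of getStorageDomainInfo as it appears in your
mail; the function names are hypothetical, and I am assuming the output format
is the same on every host.

```python
# Hostnames treated as loopback for this check (an assumption, extend as needed).
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def parse_domain_info(text):
    """Turn 'key = value' lines (as pasted from getStorageDomainInfo) into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            info[key.strip()] = value.strip()
    return info


def find_problems(info):
    """Return a list of human-readable warnings for one storage domain."""
    problems = []
    # 'pool = []' means the domain is not attached to any storage pool.
    if info.get("pool") == "[]":
        problems.append("not attached to any pool")
    # remotePath looks like 'host:/export'; keep only the host part.
    host = info.get("remotePath", "").split(":/", 1)[0].strip("[]")
    if host in LOOPBACK_HOSTS:
        problems.append("remotePath uses a loopback address")
    return problems


# Output copied from the mail above.
SAMPLE = """\
uuid = 3b5db584-5d21-41dc-8f8d-712ce9423a27
version = 3
role = Regular
remotePath = localhost:/hosted-engine
type = NFS
class = Data
pool = []
name = default
"""

if __name__ == "__main__":
    for problem in find_problems(parse_domain_info(SAMPLE)):
        print(problem)
```

You could feed it the getStorageDomainInfo output saved from each host; a
domain that reports neither warning is at least attached to a pool and
reachable through a non-local address.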
> >
> >> 2017-03-16 07:36:28,116 INFO [org.ovirt.engine.core.bll.ImportHostedEngineStorageDomainCommand] (org.ovirt.thread.pool-8-thread-38) [236d315c] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='null'}'
> >>
> >> How can I safely import the Hosted Engine Storage Domain into my setup?
> >> In this situation is safe to upgrade to oVirt 4.0?
> > I'd first try to solve this.
> >
> > What OS do you have on your hosts? Are they all upgraded to 3.6?
> >
> > See also:
> >
> > https://www.ovirt.org/documentation/how-to/hosted-engine-host-OS-upgrade/
> >
> > Best,
> >
> >>
> >> Greetings,
> >> Paolo
> >>
> >> _______________________________________________
> >> Users mailing list
> >> Users@ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
> >
> Greetings,
> Paolo