Just to clarify: you mean the host_id in /etc/ovirt-hosted-engine/hosted-engine.conf should match the spm_id, correct?
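
For anyone who lands on this thread: a quick way to compare the two values is to read host_id from each host and the SPM ids from the engine database. A rough sketch only - it assumes the default 'engine' database name and the 4.1 schema, where vds_static carries a vds_spm_id column; check the column name on your version before relying on it:

--------------------8<-------------------
# On each hosted engine host: the locally configured lockspace id
grep '^host_id=' /etc/ovirt-hosted-engine/hosted-engine.conf

# On the engine VM: the SPM id the engine assigned to each host
sudo -u postgres psql engine \
    -c "SELECT vds_name, vds_spm_id FROM vds_static ORDER BY vds_spm_id;"
--------------------8<-------------------

If a host's host_id differs from its vds_spm_id, that is the mismatch Martin describes below.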
On Fri, Jun 30, 2017 at 9:47 AM, Martin Sivak <msi...@redhat.com> wrote:
> Hi,
>
> cleaning metadata won't help in this case. Try transferring the spm_ids you got from the engine to the proper hosted engine hosts so the hosted engine ids match the spm_ids. Then restart all hosted engine services. I would actually recommend restarting all hosts after this change, but I have no idea how many VMs you have running.
>
> Martin
>
> On Thu, Jun 29, 2017 at 8:27 PM, cmc <iuco...@gmail.com> wrote:
>> Tried running 'hosted-engine --clean-metadata' as per https://bugzilla.redhat.com/show_bug.cgi?id=1350539, since ovirt-ha-agent was not running anyway, but it fails with the following error:
>>
>> ERROR:ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine:Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>>     return action(he)
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 67, in action_clean
>>     return he.clean(options.force_cleanup)
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 345, in clean
>>     self._initialize_domain_monitor()
>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 823, in _initialize_domain_monitor
>>     raise Exception(msg)
>> Exception: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Trying to restart agent
>> WARNING:ovirt_hosted_engine_ha.agent.agent.Agent:Restarting agent, attempt '0'
>> ERROR:ovirt_hosted_engine_ha.agent.agent.Agent:Too many errors occurred, giving up. Please review the log and consider filing a bug.
>> INFO:ovirt_hosted_engine_ha.agent.agent.Agent:Agent shutting down
>>
>> On Thu, Jun 29, 2017 at 6:10 PM, cmc <iuco...@gmail.com> wrote:
>>> Actually, it looks like sanlock problems:
>>>
>>> "SanlockInitializationError: Failed to initialize sanlock, the number of errors has exceeded the limit"
>>>
>>> On Thu, Jun 29, 2017 at 5:10 PM, cmc <iuco...@gmail.com> wrote:
>>>> Sorry, I am mistaken, two hosts failed for the agent with the following error:
>>>>
>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>> ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Shutting down the agent because of 3 failures in a row!
>>>>
>>>> What could cause these timeouts? Some other service not running?
>>>>
>>>> On Thu, Jun 29, 2017 at 5:03 PM, cmc <iuco...@gmail.com> wrote:
>>>>> Both services are up on all three hosts.
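
(A quick way to confirm that on each host - assuming the standard service names shipped with the 4.1 packages - and to watch what the broker is doing:

--------------------8<-------------------
# Both units should report 'active'
systemctl is-active ovirt-ha-agent ovirt-ha-broker

# Follow the broker while the agent tries to connect
tail -f /var/log/ovirt-hosted-engine-ha/broker.log
--------------------8<-------------------
)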
>>>>> The broker logs just report:
>>>>>
>>>>> Thread-6549::INFO::2017-06-29 17:01:51,481::listener::134::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(setup) Connection established
>>>>> Thread-6549::INFO::2017-06-29 17:01:51,483::listener::186::ovirt_hosted_engine_ha.broker.listener.ConnectionHandler::(handle) Connection closed
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Cam
>>>>>
>>>>> On Thu, Jun 29, 2017 at 4:00 PM, Martin Sivak <msi...@redhat.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> please make sure that both ovirt-ha-agent and ovirt-ha-broker services are restarted and up. The error says the agent can't talk to the broker. Is there anything in the broker.log?
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> Martin Sivak
>>>>>>
>>>>>> On Thu, Jun 29, 2017 at 4:42 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>> I've restarted those two services across all hosts, have taken the Hosted Engine host out of maintenance, and when I try to migrate the Hosted Engine over to another host, it reports that all three hosts 'did not satisfy internal filter HA because it is not a Hosted Engine host'.
>>>>>>>
>>>>>>> On the host that the Hosted Engine is currently on it reports in the agent.log:
>>>>>>>
>>>>>>> ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Connection closed: Connection closed
>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink ERROR Exception getting service path: Connection closed
>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call last):
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 191, in _run_agent
>>>>>>>     return action(he)
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 64, in action_proper
>>>>>>>     return he.start_monitoring()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 411, in start_monitoring
>>>>>>>     self._initialize_sanlock()
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 691, in _initialize_sanlock
>>>>>>>     constants.SERVICE_TYPE + constants.LOCKSPACE_EXTENSION)
>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 162, in get_service_path
>>>>>>>     .format(str(e)))
>>>>>>> RequestError: Failed to get service path: Connection closed
>>>>>>> Jun 29 15:22:25 kvm-ldn-03 ovirt-ha-agent[12653]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
>>>>>>>
>>>>>>> On Thu, Jun 29, 2017 at 1:25 PM, Martin Sivak <msi...@redhat.com> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> yep, you have to restart the ovirt-ha-agent and ovirt-ha-broker services.
>>>>>>>>
>>>>>>>> The scheduling message just means that the host has score 0 or is not reporting score at all.
>>>>>>>>
>>>>>>>> Martin
>>>>>>>>
>>>>>>>> On Thu, Jun 29, 2017 at 1:33 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>> Thanks Martin, do I have to restart anything? When I try to use the 'migrate' operation, it complains that the other two hosts 'did not satisfy internal filter HA because it is not a Hosted Engine host..' (even though I reinstalled both these hosts with the 'deploy hosted engine' option), which suggests that something needs restarting. Should I worry about the sanlock errors, or will that be resolved by the change in host_id?
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>>
>>>>>>>>> Cam
>>>>>>>>>
>>>>>>>>> On Thu, Jun 29, 2017 at 12:22 PM, Martin Sivak <msi...@redhat.com> wrote:
>>>>>>>>>> Change the ids so they are distinct. I need to check if there is a way to read the SPM ids from the engine as using the same numbers would be the best.
>>>>>>>>>>
>>>>>>>>>> Martin
>>>>>>>>>>
>>>>>>>>>> On Thu, Jun 29, 2017 at 12:46 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>> Is there any way of recovering from this situation? I'd prefer to fix the issue rather than re-deploy, but if there is no recovery path, I could perhaps try re-deploying the hosted engine. In which case, would the best option be to take a backup of the Hosted Engine, and then shut it down, re-initialise the SAN partition (or use another partition) and retry the deployment? Would it be better to use the older backup from the bare metal engine that I originally used, or use a backup from the Hosted Engine? I'm not sure if any VMs have been added since switching to Hosted Engine.
>>>>>>>>>>>
>>>>>>>>>>> Unfortunately I have very little time left to get this working before I have to hand it over for eval (by end of Friday).
>>>>>>>>>>>
>>>>>>>>>>> Here are some log snippets from the cluster that are current.
>>>>>>>>>>>
>>>>>>>>>>> In /var/log/vdsm/vdsm.log on the host that has the Hosted Engine:
>>>>>>>>>>>
>>>>>>>>>>> 2017-06-29 10:50:15,071+0100 INFO (monitor/207221b) [storage.SANLock] Acquiring host id for domain 207221b2-959b-426b-b945-18e1adfed62f (id: 3) (clusterlock:282)
>>>>>>>>>>> 2017-06-29 10:50:15,072+0100 ERROR (monitor/207221b) [storage.Monitor] Error acquiring host id 3 for domain 207221b2-959b-426b-b945-18e1adfed62f (monitor:558)
>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>   File "/usr/share/vdsm/storage/monitor.py", line 555, in _acquireHostId
>>>>>>>>>>>     self.domain.acquireHostId(self.hostId, async=True)
>>>>>>>>>>>   File "/usr/share/vdsm/storage/sd.py", line 790, in acquireHostId
>>>>>>>>>>>     self._manifest.acquireHostId(hostId, async)
>>>>>>>>>>>   File "/usr/share/vdsm/storage/sd.py", line 449, in acquireHostId
>>>>>>>>>>>     self._domainLock.acquireHostId(hostId, async)
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 297, in acquireHostId
>>>>>>>>>>>     raise se.AcquireHostIdFailure(self._sdUUID, e)
>>>>>>>>>>> AcquireHostIdFailure: Cannot acquire host id: ('207221b2-959b-426b-b945-18e1adfed62f', SanlockException(22, 'Sanlock lockspace add failure', 'Invalid argument'))
>>>>>>>>>>>
>>>>>>>>>>> From /var/log/ovirt-hosted-engine-ha/agent.log on the same host:
>>>>>>>>>>>
>>>>>>>>>>> MainThread::ERROR::2017-06-19 13:30:50,592::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>>>>>>>> MainThread::WARNING::2017-06-19 13:30:50,593::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>>>>>>>> MainThread::WARNING::2017-06-19 13:30:50,593::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 443, in start_monitoring
>>>>>>>>>>>     self._initialize_domain_monitor()
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 823, in _initialize_domain_monitor
>>>>>>>>>>>     raise Exception(msg)
>>>>>>>>>>> Exception: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>>>>>>>> MainThread::ERROR::2017-06-19 13:30:50,593::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
>>>>>>>>>>>
>>>>>>>>>>> From sanlock.log:
>>>>>>>>>>>
>>>>>>>>>>> 2017-06-29 11:17:06+0100 1194149 [2530]: add_lockspace 207221b2-959b-426b-b945-18e1adfed62f:3:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0 conflicts with name of list1 s5 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>>>>>
>>>>>>>>>>> From the two other hosts:
>>>>>>>>>>>
>>>>>>>>>>> host 2:
>>>>>>>>>>>
>>>>>>>>>>> vdsm.log
>>>>>>>>>>>
>>>>>>>>>>> 2017-06-29 10:53:47,755+0100 ERROR (jsonrpc/4) [jsonrpc.JsonRpcServer] Internal server error (__init__:570)
>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 565, in _handle_request
>>>>>>>>>>>     res = method(**params)
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod
>>>>>>>>>>>     result = fn(*methodArgs)
>>>>>>>>>>>   File "/usr/share/vdsm/API.py", line 1454, in getAllVmIoTunePolicies
>>>>>>>>>>>     io_tune_policies_dict = self._cif.getAllVmIoTunePolicies()
>>>>>>>>>>>   File "/usr/share/vdsm/clientIF.py", line 448, in getAllVmIoTunePolicies
>>>>>>>>>>>     'current_values': v.getIoTune()}
>>>>>>>>>>>   File "/usr/share/vdsm/virt/vm.py", line 2803, in getIoTune
>>>>>>>>>>>     result = self.getIoTuneResponse()
>>>>>>>>>>>   File "/usr/share/vdsm/virt/vm.py", line 2816, in getIoTuneResponse
>>>>>>>>>>>     res = self._dom.blockIoTune(
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 47, in __getattr__
>>>>>>>>>>>     % self.vmid)
>>>>>>>>>>> NotConnectedError: VM u'a79e6b0e-fff4-4cba-a02c-4c00be151300' was not started yet or was shut down
>>>>>>>>>>>
>>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/agent.log
>>>>>>>>>>>
>>>>>>>>>>> MainThread::INFO::2017-06-29 10:56:33,636::ovf_store::103::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(scan) Found OVF_STORE: imgUUID:222610db-7880-4f4f-8559-a3635fd73555, volUUID:c6e0d29b-eabf-4a09-a330-df54cfdd73f1
>>>>>>>>>>> MainThread::INFO::2017-06-29 10:56:33,926::ovf_store::112::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Extracting Engine VM OVF from the OVF_STORE
>>>>>>>>>>> MainThread::INFO::2017-06-29 10:56:33,938::ovf_store::119::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) OVF_STORE volume path: /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/images/222610db-7880-4f4f-8559-a3635fd73555/c6e0d29b-eabf-4a09-a330-df54cfdd73f1
>>>>>>>>>>> MainThread::INFO::2017-06-29 10:56:33,967::config::431::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Found an OVF for HE VM, trying to convert
>>>>>>>>>>> MainThread::INFO::2017-06-29 10:56:33,971::config::436::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Got vm.conf from OVF_STORE
>>>>>>>>>>> MainThread::INFO::2017-06-29 10:56:36,736::states::678::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Thu Jun 29 10:53:59 2017
>>>>>>>>>>> MainThread::INFO::2017-06-29 10:56:36,736::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)
>>>>>>>>>>> MainThread::INFO::2017-06-29 10:56:46,772::config::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(refresh_vm_conf) Reloading vm.conf from the shared storage domain
>>>>>>>>>>>
>>>>>>>>>>> /var/log/messages:
>>>>>>>>>>>
>>>>>>>>>>> Jun 29 10:53:46 kvm-ldn-02 kernel: dd: sending ioctl 80306d02 to a partition!
>>>>>>>>>>>
>>>>>>>>>>> host 1:
>>>>>>>>>>>
>>>>>>>>>>> /var/log/messages, also in sanlock.log:
>>>>>>>>>>>
>>>>>>>>>>> Jun 29 11:01:02 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:02+0100 678325 [9132]: s4531 delta_acquire host_id 1 busy1 1 2 1193177 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
>>>>>>>>>>> Jun 29 11:01:03 kvm-ldn-01 sanlock[2400]: 2017-06-29 11:01:03+0100 678326 [24159]: s4531 add_lockspace fail result -262
>>>>>>>>>>>
>>>>>>>>>>> /var/log/ovirt-hosted-engine-ha/agent.log:
>>>>>>>>>>>
>>>>>>>>>>> MainThread::ERROR::2017-06-27 15:21:01,143::hosted_engine::822::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_domain_monitor) Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>>>>>>>> MainThread::WARNING::2017-06-27 15:21:01,144::hosted_engine::469::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Error while monitoring engine: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>>>>>>>> MainThread::WARNING::2017-06-27 15:21:01,144::hosted_engine::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Unexpected error
>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 443, in start_monitoring
>>>>>>>>>>>     self._initialize_domain_monitor()
>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 823, in _initialize_domain_monitor
>>>>>>>>>>>     raise Exception(msg)
>>>>>>>>>>> Exception: Failed to start monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f, host_id=1): timeout during domain acquisition
>>>>>>>>>>> MainThread::ERROR::2017-06-27 15:21:01,144::hosted_engine::485::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Shutting down the agent because of 3 failures in a row!
>>>>>>>>>>> MainThread::INFO::2017-06-27 15:21:06,717::hosted_engine::848::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_domain_monitor_status) VDSM domain monitor status: PENDING
>>>>>>>>>>> MainThread::INFO::2017-06-27 15:21:09,335::hosted_engine::776::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_domain_monitor) Failed to stop monitoring domain (sd_uuid=207221b2-959b-426b-b945-18e1adfed62f): Storage domain is member of pool: u'domain=207221b2-959b-426b-b945-18e1adfed62f'
>>>>>>>>>>> MainThread::INFO::2017-06-27 15:21:09,339::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down
>>>>>>>>>>>
>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>
>>>>>>>>>>> Cam
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 28, 2017 at 11:25 AM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>
>>>>>>>>>>>> yes, on two of the machines they have the same host_id. The other has a different host_id.
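
(For reference, moving a host onto the id the engine assigned comes down to something like this - a sketch only; the value 3 here is hypothetical and should be the vds_spm_id the engine reports for that host:

--------------------8<-------------------
# Point the local HA config at the engine-assigned SPM id
sed -i 's/^host_id=.*/host_id=3/' /etc/ovirt-hosted-engine/hosted-engine.conf

# Restart the HA services so the agent joins the lockspace under the new id
systemctl restart ovirt-ha-broker ovirt-ha-agent

# Check that sanlock now holds the lockspace without an id conflict
sanlock client status
--------------------8<-------------------

As Martin says at the top of the thread, a full host restart after the change is the safer option.)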
>>>>>>>>>>>> To update since yesterday: I reinstalled and deployed Hosted Engine on the other host (so all three hosts in the cluster now have it installed). The second one I deployed said it was able to host the engine (unlike the first I reinstalled), so I tried putting the host with the Hosted Engine on it into maintenance to see if it would migrate over. It managed to move all VMs but the Hosted Engine. And now the host that said it was able to host the engine says 'unavailable due to HA score'. The host that it was trying to move from has been in 'preparing for maintenance' for the last 12 hours.
>>>>>>>>>>>>
>>>>>>>>>>>> The summary is:
>>>>>>>>>>>>
>>>>>>>>>>>> kvm-ldn-01 - one of the original, pre-Hosted Engine hosts, reinstalled with 'Deploy Hosted Engine'. No icon saying it can host the Hosted Engine; host_id of '2' in /etc/ovirt-hosted-engine/hosted-engine.conf. 'add_lockspace' fails in sanlock.log.
>>>>>>>>>>>>
>>>>>>>>>>>> kvm-ldn-02 - the other host that was pre-existing before Hosted Engine was created. Reinstalled with 'Deploy Hosted Engine'. Had an icon saying that it was able to host the Hosted Engine, but after migration was attempted when putting kvm-ldn-03 into maintenance, it reports: 'unavailable due to HA score'. It has a host_id of '1' in /etc/ovirt-hosted-engine/hosted-engine.conf. No errors in sanlock.log.
>>>>>>>>>>>>
>>>>>>>>>>>> kvm-ldn-03 - this was the host I deployed Hosted Engine on, which was not part of the original cluster. I restored the bare-metal engine backup in the Hosted Engine on this host when deploying it, without error. It currently has the Hosted Engine on it (as the only VM, after I put that host into maintenance to test the HA of Hosted Engine). Sanlock log shows conflicts.
>>>>>>>>>>>>
>>>>>>>>>>>> I will look through all the logs for any other errors. Please let me know if you need any logs or other clarification/information.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> Campbell
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jun 28, 2017 at 9:25 AM, Martin Sivak <msi...@redhat.com> wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> can you please check the contents of /etc/ovirt-hosted-engine/hosted-engine.conf or /etc/ovirt-hosted-engine-ha/agent.conf (I am not sure which one it is right now) and search for host-id?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Make sure the IDs are different. If they are not, then there is a bug somewhere.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Martin
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 6:26 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>> I see this on the host it is trying to migrate in /var/log/sanlock:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2017-06-27 17:10:40+0100 527703 [2407]: s3528 lockspace 207221b2-959b-426b-b945-18e1adfed62f:1:/dev/207221b2-959b-426b-b945-18e1adfed62f/ids:0
>>>>>>>>>>>>>> 2017-06-27 17:13:00+0100 527843 [27446]: s3528 delta_acquire host_id 1 busy1 1 2 1042692 3d4ec963-8486-43a2-a7d9-afa82508f89f.kvm-ldn-03
>>>>>>>>>>>>>> 2017-06-27 17:13:01+0100 527844 [2407]: s3528 add_lockspace fail result -262
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The sanlock service is running. Why would this occur?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> C
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 5:21 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>> Hi Martin,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for the reply. I have done this, and the deployment completed without error. However, it still will not allow the Hosted Engine to migrate to another host. The /etc/ovirt-hosted-engine/hosted-engine.conf got created ok on the host I re-installed, but the ovirt-ha-broker.service, though it starts, reports:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --------------------8<-------------------
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jun 27 14:58:26 kvm-ldn-01 systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
>>>>>>>>>>>>>>> Jun 27 14:58:27 kvm-ldn-01 ovirt-ha-broker[6101]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.storage_broker.StorageBroker ERROR Failed to read metadata from /rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata
>>>>>>>>>>>>>>> Traceback (most recent call last):
>>>>>>>>>>>>>>>   File "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py", line 129, in get_raw_stats_for_service_type
>>>>>>>>>>>>>>>     f = os.open(path, direct_flag | os.O_RDONLY | os.O_SYNC)
>>>>>>>>>>>>>>> OSError: [Errno 2] No such file or directory: '/rhev/data-center/mnt/blockSD/207221b2-959b-426b-b945-18e1adfed62f/ha_agent/hosted-engine.metadata'
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --------------------8<-------------------
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I checked the path, and it exists. I can run 'less -f' on it fine. The perms are slightly different on the host that is running the VM vs the one that is reporting errors (600 vs 660); ownership is vdsm:qemu. Is this a san locking issue?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 1:41 PM, Martin Sivak <msi...@redhat.com> wrote:
>>>>>>>>>>>>>>>>> Should it be? It was not in the instructions for the migration from bare-metal to Hosted VM
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The hosted engine will only migrate to hosts that have the services running. Please put one other host to maintenance and select Hosted engine action: DEPLOY in the reinstall dialog.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best regards
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Martin Sivak
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Jun 27, 2017 at 1:23 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>> I changed the 'os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl' line to have the same display protocols as 4 and the hosted engine now appears in the list of VMs. I am guessing the compatibility version was causing it to use the 3.6 version. However, I am still unable to migrate the engine VM to another host. When I try putting the host it is currently on into maintenance, it reports:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Error while executing action: Cannot switch the Host(s) to Maintenance mode.
>>>>>>>>>>>>>>>>> There are no available hosts capable of running the engine VM.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Running 'hosted-engine --vm-status' still shows 'Engine status: unknown stale-data'.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The ovirt-ha-broker service is only running on one host. It was set to 'disabled' in systemd. It won't start as there is no /etc/ovirt-hosted-engine/hosted-engine.conf on the other two hosts. Should it be? It was not in the instructions for the migration from bare-metal to Hosted VM.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 1:07 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>> Hi Tomas,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> So in my /usr/share/ovirt-engine/conf/osinfo-defaults.properties on my engine VM, I have:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/cirrus,vnc/qxl
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> That seems to match - I assume since this is 4.1, the 3.6 should not apply.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is there somewhere else I should be looking?
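
(One note on that: rather than editing /usr/share/ovirt-engine/conf/osinfo-defaults.properties in place - upgrades can overwrite it - custom values are meant to live in a drop-in under /etc/ovirt-engine/osinfo.conf.d/ on the engine VM. A sketch, assuming the 4.1 osinfo layout; the file name is made up, and files are read in lexical order:

--------------------8<-------------------
cat > /etc/ovirt-engine/osinfo.conf.d/99-display-protocols.properties <<'EOF'
os.other.devices.display.protocols.value.3.6 = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
EOF

# The engine only reads osinfo at startup
systemctl restart ovirt-engine
--------------------8<-------------------
)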
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 11:40 AM, Tomas Jelinek <tjeli...@redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Jun 22, 2017 at 12:38 PM, Michal Skrivanek <michal.skriva...@redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> > On 22 Jun 2017, at 12:31, Martin Sivak <msi...@redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > Tomas, what fields are needed in a VM to pass the check that causes the following error?
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> >>>>> WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm' failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> to match the OS and VM Display type ;-) Configuration is in osinfo... e.g. if that is an import from older releases on Linux, this is typically caused by the change of cirrus to vga for non-SPICE VMs.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> yep, the default supported combinations for 4.0+ is this:
>>>>>>>>>>>>>>>>>>> os.other.devices.display.protocols.value = spice/qxl,vnc/vga,vnc/qxl,vnc/cirrus
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> > Thanks.
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > On Thu, Jun 22, 2017 at 12:19 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> >> Hi Martin,
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >>> just as a random comment, do you still have the database backup from the bare metal -> VM attempt? It might be possible to just try again using it. Or in the worst case.. update the offending value there before restoring it to the new engine instance.
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> I still have the backup. I'd rather do the latter, as re-running the HE deployment is quite lengthy and involved (I have to re-initialise the FC storage each time). Do you know what the offending value(s) would be? Would it be in the Postgres DB or in a config file somewhere?
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> Cheers,
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> Cam
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >>> Regards
>>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>>> >>> Martin Sivak
>>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>>> >>> On Thu, Jun 22, 2017 at 11:39 AM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>> Hi Yanir,
>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>> >>>> Thanks for the reply.
>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>> >>>>> First of all, maybe a chain reaction of: WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm' failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS is causing the hosted engine vm not to be set up correctly, and further actions were made when the hosted engine vm wasn't in a stable state.
>>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>>> >>>>> As for now, are you trying to revert back to a previous/initial state?
>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>> >>>> I'm not trying to revert it to a previous state for now. This was a migration from a bare metal engine, and it didn't report any error during the migration. I'd had some problems on my first attempts at this migration, whereby it never completed (due to a proxy issue), but I managed to resolve this. Do you know of a way to get the Hosted Engine VM into a stable state, without rebuilding the entire cluster from scratch (since I have a lot of VMs on it)?
>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>> >>>> Thanks for any help.
>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>> >>>> Regards,
>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>> >>>> Cam
>>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>>> >>>>> Regards,
>>>>>>>>>>>>>>>>>>>> >>>>> Yanir
>>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>>> >>>>> On Wed, Jun 21, 2017 at 4:32 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>> Hi Jenny/Martin,
>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>> Any idea what I can do here? The hosted engine VM has no log on any host in /var/log/libvirt/qemu, and I fear that if I need to put the host I created it on (which I think is hosting it) into maintenance, e.g. to upgrade it, or if it fails for any reason, it won't get migrated to another host, and I will not be able to manage the cluster. It seems to be a very dangerous position to be in.
>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>> Cam
>>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>> On Wed, Jun 21, 2017 at 11:48 AM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>>>>> Thanks Martin. The hosts are all part of the same cluster.
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> I get these errors in the engine.log on the engine:
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z WARN [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-6-thread-23) [] Validation of action 'ImportVm' failed for user SYSTEM. Reasons: VAR__ACTION__IMPORT,VAR__TYPE__VM,ACTION_TYPE_FAILED_ILLEGAL_VM_DISPLAY_TYPE_IS_NOT_SUPPORTED_BY_OS
>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z INFO [org.ovirt.engine.core.bll.exportimport.ImportVmCommand] (org.ovirt.thread.pool-6-thread-23) [] Lock freed to object 'EngineLock:{exclusiveLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>, HostedEngine=<VM_NAME, ACTION_TYPE_FAILED_NAME_ALREADY_USED>]', sharedLocks='[a79e6b0e-fff4-4cba-a02c-4c00be151300=<REMOTE_VM, ACTION_TYPE_FAILED_VM_IS_BEING_IMPORTED$VmName HostedEngine>]'}'
>>>>>>>>>>>>>>>>>>>> >>>>>>> 2017-06-19 03:28:05,030Z ERROR [org.ovirt.engine.core.bll.HostedEngineImporter] (org.ovirt.thread.pool-6-thread-23) [] Failed importing the Hosted Engine VM
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> The sanlock.log reports conflicts on that same host, and a different error on the other hosts; not sure if they are related.
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> And this in the /var/log/ovirt-hosted-engine-ha/agent log on the host which I deployed the hosted engine VM on:
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> MainThread::ERROR::2017-06-19 13:09:49,743::ovf_store::124::ovirt_hosted_engine_ha.lib.ovf.ovf_store.OVFStore::(getEngineVMOVF) Unable to extract HEVM OVF
>>>>>>>>>>>>>>>>>>>> >>>>>>> MainThread::ERROR::2017-06-19 13:09:49,743::config::445::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config::(_get_vm_conf_content_from_ovf_store) Failed extracting VM OVF from the OVF_STORE volume, falling back to initial vm.conf
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> I've seen some of these issues reported in bugzilla, but they were for older versions of oVirt (and appear to be resolved).
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> I will install that package on the other two hosts; I will put them in maintenance first, as vdsm is installed as an upgrade. I guess restarting vdsm is a good idea after that?
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> Campbell
>>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>> On Wed, Jun 21, 2017 at 10:51 AM, Martin Sivak <msi...@redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>> you do not have to install it on all hosts. But you should have more than one, and ideally all hosted engine enabled nodes should belong to the same engine cluster.
>>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>> Best regards
>>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>> Martin Sivak
>>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>> On Wed, Jun 21, 2017 at 11:29 AM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Does ovirt-hosted-engine-ha need to be installed across all hosts? Could that be the reason it is failing to see it properly?
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>> On Mon, Jun 19, 2017 at 1:27 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Logs are attached. I can see errors in there, but am unsure how they arose.
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Campbell
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>> On Mon, Jun 19, 2017 at 12:29 PM, Evgenia Tokar <eto...@redhat.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> From the output it looks like the agent is down; try starting it by running: systemctl start ovirt-ha-agent.
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> The engine is supposed to see the hosted engine storage domain and import it to the system, then it should import the hosted engine vm.
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Can you attach the agent log from the host (/var/log/ovirt-hosted-engine-ha/agent.log) and the engine log from the engine vm (/var/log/ovirt-engine/engine.log)?
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Jenny
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> On Mon, Jun 19, 2017 at 12:41 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Hi Jenny,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> What version are you running?
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> 4.1.2.2-1.el7.centos
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> For the hosted engine vm to be imported and displayed in the engine, you must first create a master storage domain.
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> To provide a bit more detail: this was a migration of a bare-metal engine in an existing cluster to a hosted engine VM for that cluster. As part of this migration, I built an entirely new host and ran 'hosted-engine --deploy' (followed these instructions: http://www.ovirt.org/documentation/self-hosted/chap-Migrating_from_Bare_Metal_to_an_EL-Based_Self-Hosted_Environment/). I restored the backup from the engine and it completed without any errors. I didn't see any instructions regarding a master storage domain in the page above. The cluster has two existing master storage domains, one is fibre channel, which is up, and one ISO domain, which is currently offline.
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> What do you mean the hosted engine commands are failing? What happens when you run hosted-engine --vm-status now?
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Interestingly, whereas when I ran it before, it exited with no output and a return code of '1', it now reports:
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> --== Host 1 status ==--
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> conf_on_shared_storage          : True
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Status up-to-date               : False
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Hostname                        : kvm-ldn-03.ldn.fscfc.co.uk
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Host ID                         : 1
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Engine status                   : unknown stale-data
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Score                           : 0
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> stopped                         : True
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Local maintenance               : False
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> crc32                           : 0217f07b
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> local_conf_timestamp            : 2911
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Host timestamp                  : 2897
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Extra metadata (valid at timestamp):
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     metadata_parse_version=1
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     metadata_feature_version=1
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     timestamp=2897 (Thu Jun 15 16:22:54 2017)
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     host-id=1
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     score=0
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     vm_conf_refresh_time=2911 (Thu Jun 15 16:23:08 2017)
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     conf_on_shared_storage=True
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     maintenance=False
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     state=AgentStopped
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>     stopped=True
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Yet I can login to the web GUI fine. I guess it is not HA due to being in an unknown state currently? Does the hosted-engine-ha rpm need to be installed across all nodes in the cluster, btw?
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Thanks for the help,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Cam
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> Jenny Tokar
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> On Thu, Jun 15, 2017 at 6:32 PM, cmc <iuco...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> I've migrated from a bare-metal engine to a hosted engine. There were no errors during the install; however, the hosted engine did not get started. I tried running:
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> hosted-engine --status
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> on the host I deployed it on, and it returns nothing (exit code is 1 however). I could not ping it either. So I tried starting it via 'hosted-engine --vm-start' and it returned:
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Virtual machine does not exist
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> But it then became available. I logged into it successfully. It is not in the list of VMs however.
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Any ideas why the hosted-engine commands fail, and why it is not in the list of virtual machines?
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for any help,
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>> Cam

_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users