Robert, I understand the frustration here. The recovery feels brutal, but the monolithic design and the dense ecosystem are understandable given the purpose it serves.

I am able to mount the raw disk image for the HostedEngine VM cleanly without any errors and it seems to check out, so I don't believe there is any corruption. Everything looks to operate as expected and then it just seems to snag somewhere through the startup. I suppose I'm just trying to trace down the hiccup to clear it out of the way and let the VM boot up. My knowledge is a bit limited digging in and troubleshooting the components here.

Additional snippet:

MainThread::INFO::2021-02-09 21:00:07,357::hosted_engine::863::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stderr: Command VM.getStats with args {'vmID': '74b3c839-c89c-4857-ada0-95715672348a'} failed: (code=1, message=Virtual machine does not exist: {'vmId': '74b3c839-c89c-4857-ada0-95715672348a'})
MainThread::INFO::2021-02-09 21:00:07,357::hosted_engine::875::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Engine VM started on localhost
MainThread::INFO::2021-02-09 21:00:07,389::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineStart-EngineStarting) sent? ignored
MainThread::INFO::2021-02-09 21:00:07,406::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
MainThread::INFO::2021-02-09 21:00:17,427::states::740::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Another host already took over..

*Thank you,*
*Ian Easter*

On Tue, Feb 9, 2021 at 6:31 PM Robert Tongue <phuny...@neverserio.us> wrote:

> I've seen this happen with the VM disk itself becoming corrupt. If you
> try to read the contents of the file, and it gives you "Input/Output
> Error", then it is not good news. I've been testing oVirt recently, and
> these issues alone are preventing me from using it full time. I cannot
> help further, unfortunately, as I have no idea how to fix it. So best I
> can say is, hopefully someone else chimes in and helps both of us.
>
> -phunyguy
> ------------------------------
> *From:* ieas...@telvue.com <ieas...@telvue.com>
> *Sent:* Tuesday, February 9, 2021 6:25 PM
> *To:* users@ovirt.org <users@ovirt.org>
> *Subject:* [ovirt-users] Re: HostedEngine VM Paused after power failure
>
> Attempting to resume or start the VM doesn't yield any results.
>
> Here is the status of the VM:
> Host ID : 1
> Host timestamp : 115601
> Score : 3400
> Engine status : {"vm": "up", "health": "bad", "detail": "Paused", "reason": "bad vm status"}
> Hostname :
> Local maintenance : False
> stopped : False
> crc32 : 68efbf40
> conf_on_shared_storage : True
> local_conf_timestamp : 115601
> Status up-to-date : True
> Extra metadata (valid at timestamp):
> metadata_parse_version=1
> metadata_feature_version=1
> timestamp=115601 (Tue Feb 9 18:25:48 2021)
> host-id=1
> score=3400
> vm_conf_refresh_time=115601 (Tue Feb 9 18:25:48 2021)
> conf_on_shared_storage=True
> maintenance=False
> state=EngineStarting
> stopped=False
>
>
> Here is a chunk in agent.log that is a bit perplexing. I'm not too sure
> what it means that the VM doesn't exist. Storage is correctly mounted,
> everything looks fully operational. I can see the HostedEngine disk
> available to the Host.
>
> MainThread::INFO::2021-02-09 18:08:13,843::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineDown (score: 3400)
> MainThread::INFO::2021-02-09 18:08:23,864::states::467::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine down and local host has best score (3400), attempting to start engine VM
> MainThread::INFO::2021-02-09 18:08:23,894::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineDown-EngineStart) sent? ignored
> MainThread::INFO::2021-02-09 18:08:23,983::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStart (score: 3400)
> MainThread::INFO::2021-02-09 18:08:24,000::hosted_engine::895::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state) Ensuring VDSM state is clear for engine VM
> MainThread::INFO::2021-02-09 18:08:24,005::hosted_engine::907::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state) Vdsm state for VM clean
> MainThread::INFO::2021-02-09 18:08:24,005::hosted_engine::853::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Starting vm using `/usr/sbin/hosted-engine --vm-start`
> MainThread::INFO::2021-02-09 18:08:24,519::hosted_engine::862::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stdout: VM in WaitForLaunch
>
> MainThread::INFO::2021-02-09 18:08:24,519::hosted_engine::863::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) stderr: Command VM.getStats with args {'vmID': '74b3c839-c89c-4857-ada0-95715672348a'} failed: (code=1, message=Virtual machine does not exist: {'vmId': '74b3c839-c89c-4857-ada0-95715672348a'})
>
> MainThread::INFO::2021-02-09 18:08:24,519::hosted_engine::875::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm) Engine VM started on localhost
> MainThread::INFO::2021-02-09 18:08:24,552::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify) Success, was notification of state_transition (EngineStart-EngineStarting) sent? ignored
> MainThread::INFO::2021-02-09 18:08:24,565::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2021-02-09 18:08:34,585::states::736::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2021-02-09 18:08:34,590::state_decorators::99::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check) Timeout set to Tue Feb 9 18:18:34 2021 while transitioning <class 'ovirt_hosted_engine_ha.agent.states.EngineStarting'> -> <class 'ovirt_hosted_engine_ha.agent.states.EngineStarting'>
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/privacy-policy.html
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/UDKODQL5A4NNIWJMONVYTFIGC3256URS/
>
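
For anyone tracing the same startup sequence, here is a minimal sketch of how the agent.log excerpts above could be reduced to just the state changes. It assumes every entry follows the layout seen in those excerpts (thread::LEVEL::timestamp::module::line::logger::(function) message) and that states are reported as "Current state <Name> (score: <N>)". The names `parse_line` and `state_changes` are made up for illustration and are not part of ovirt-hosted-engine-ha; the sample lines are copied from the log quoted above.

```python
import re

# Layout assumed from the excerpts in this thread:
# thread::LEVEL::timestamp::module::line::logger::(function) message
LOG_LINE = re.compile(
    r"^(?P<thread>\w+)::(?P<level>\w+)::"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})::"
    r"(?P<module>\w+)::(?P<lineno>\d+)::"
    r"(?P<logger>[\w.]+)::\((?P<function>\w+)\) "
    r"(?P<message>.*)$"
)
# Message format assumed from the "_monitoring_loop" entries.
STATE = re.compile(r"Current state (?P<state>\w+) \(score: (?P<score>\d+)\)")


def parse_line(line):
    """Return the named fields of one log entry, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def state_changes(lines):
    """Yield (timestamp, state, score) for every 'Current state' entry."""
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip wrapped or unrelated lines
        found = STATE.search(entry["message"])
        if found:
            yield entry["timestamp"], found.group("state"), int(found.group("score"))


# Entries copied from the agent.log quoted in this thread.
SAMPLE = [
    "MainThread::INFO::2021-02-09 18:08:13,843::hosted_engine::517::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) "
    "Current state EngineDown (score: 3400)",
    "MainThread::INFO::2021-02-09 18:08:23,864::states::467::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) "
    "Engine down and local host has best score (3400), attempting to start engine VM",
    "MainThread::INFO::2021-02-09 18:08:23,983::hosted_engine::517::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) "
    "Current state EngineStart (score: 3400)",
    "MainThread::INFO::2021-02-09 18:08:24,565::hosted_engine::517::"
    "ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) "
    "Current state EngineStarting (score: 3400)",
]

if __name__ == "__main__":
    for timestamp, state, score in state_changes(SAMPLE):
        print(timestamp, state, score)
```

To run it against a real log, the `SAMPLE` list could be swapped for an open file handle, since `state_changes` only needs an iterable of lines. Entries that were wrapped by a mail client will not match and are skipped.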
_______________________________________________ Users mailing list -- users@ovirt.org To unsubscribe send an email to users-le...@ovirt.org Privacy Statement: https://www.ovirt.org/privacy-policy.html oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/ List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/R7UKX52RCOTB2QVRKGLCWAAZJXA3IBBK/