Hi! 
For the last few days I have been having a problem with the Hosted Engine: it keeps 
restarting, sometimes after a few minutes, sometimes after a few hours.
I haven't made any changes to oVirt or the network in that time.
The version is 4.4.10.7-1.el8 (this was also the version originally installed).

Here are the logs:
Agent.log------------------------------
MainThread::INFO::2022-08-21 
09:48:36,200::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineUp (score: 2440)
MainThread::INFO::2022-08-21 
09:48:36,200::hosted_engine::525::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Best remote host node3.ovirt.example.org (id: 3, score: 2440)
MainThread::ERROR::2022-08-21 
09:48:46,212::states::398::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Host node3.ovirt.example.org (id 3) score is significantly better than local 
score, shutting down VM on this host
MainThread::INFO::2022-08-21 
09:48:46,641::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
 Success, was notification of state_transition (EngineUp-EngineStop) sent? 
ignored
MainThread::INFO::2022-08-21 
09:48:46,706::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineStop (score: 3400)
MainThread::INFO::2022-08-21 
09:48:46,706::hosted_engine::525::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Best remote host node3.ovirt.example.org (id: 3, score: 3400)
MainThread::INFO::2022-08-21 
09:48:56,714::hosted_engine::934::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
 Shutting down vm using `/usr/sbin/hosted-engine --vm-shutdown`
MainThread::INFO::2022-08-21 
09:48:56,871::hosted_engine::941::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
 stdout: 
MainThread::INFO::2022-08-21 
09:48:56,871::hosted_engine::942::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
 stderr: 
MainThread::ERROR::2022-08-21 
09:48:56,871::hosted_engine::950::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_stop_engine_vm)
 Engine VM stopped on localhost
MainThread::INFO::2022-08-21 
09:48:56,880::state_decorators::102::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
 Timeout set to Sun Aug 21 09:53:56 2022 while transitioning <class 
'ovirt_hosted_engine_ha.agent.states.EngineStop'> -> <class 
'ovirt_hosted_engine_ha.agent.states.EngineStop'>
MainThread::INFO::2022-08-21 
09:48:56,959::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineStop (score: 3400)
MainThread::INFO::2022-08-21 
09:49:06,977::states::537::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Engine vm not running on local host
MainThread::INFO::2022-08-21 
09:49:06,983::state_decorators::95::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
 Timeout cleared while transitioning <class 
'ovirt_hosted_engine_ha.agent.states.EngineStop'> -> <class 
'ovirt_hosted_engine_ha.agent.states.EngineDown'>
MainThread::INFO::2022-08-21 
09:49:07,173::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
 Success, was notification of state_transition (EngineStop-EngineDown) sent? 
ignored
MainThread::INFO::2022-08-21 
09:49:07,795::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineDown (score: 3400)
MainThread::INFO::2022-08-21 
09:49:16,811::states::472::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Engine down and local host has best score (3400), attempting to start engine VM
MainThread::INFO::2022-08-21 
09:49:16,998::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
 Success, was notification of state_transition (EngineDown-EngineStart) sent? 
ignored
MainThread::INFO::2022-08-21 
09:49:17,179::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineStart (score: 3400)
MainThread::INFO::2022-08-21 
09:49:17,195::hosted_engine::895::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
 Ensuring VDSM state is clear for engine VM
MainThread::INFO::2022-08-21 
09:49:17,200::hosted_engine::915::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
 Cleaning state for non-running VM
MainThread::INFO::2022-08-21 
09:49:18,211::hosted_engine::907::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_clean_vdsm_state)
 Vdsm state for VM clean
MainThread::INFO::2022-08-21 
09:49:18,212::hosted_engine::853::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
 Starting vm using `/usr/sbin/hosted-engine --vm-start`
MainThread::INFO::2022-08-21 
09:49:18,814::hosted_engine::862::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
 stdout: VM in WaitForLaunch

MainThread::INFO::2022-08-21 
09:49:18,814::hosted_engine::863::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
 stderr: Command VM.getStats with args {'vmID': 
'cc7931ff-8124-4724-9242-abea2ab5bf42'} failed:
(code=1, message=Virtual machine does not exist: {'vmId': 
'cc7931ff-8124-4724-9242-abea2ab5bf42'})

MainThread::INFO::2022-08-21 
09:49:18,814::hosted_engine::875::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_start_engine_vm)
 Engine VM started on localhost
MainThread::INFO::2022-08-21 
09:49:18,999::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
 Success, was notification of state_transition (EngineStart-EngineStarting) 
sent? ignored
MainThread::INFO::2022-08-21 
09:49:19,008::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineStarting (score: 3400)
MainThread::INFO::2022-08-21 
09:49:29,027::states::741::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 VM is powering up..
MainThread::INFO::2022-08-21 
09:49:29,033::state_decorators::102::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
 Timeout set to Sun Aug 21 09:59:29 2022 while transitioning <class 
'ovirt_hosted_engine_ha.agent.states.EngineStarting'> -> <class 
'ovirt_hosted_engine_ha.agent.states.EngineStarting'>
MainThread::INFO::2022-08-21 
09:49:29,109::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineStarting (score: 3400)
MainThread::INFO::2022-08-21 
09:49:38,121::states::741::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 VM is powering up..
MainThread::INFO::2022-08-21 
09:49:38,195::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineStarting (score: 3400)
MainThread::INFO::2022-08-21 
09:49:48,218::state_decorators::95::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(check)
 Timeout cleared while transitioning <class 
'ovirt_hosted_engine_ha.agent.states.EngineStarting'> -> <class 
'ovirt_hosted_engine_ha.agent.states.EngineUp'>
MainThread::INFO::2022-08-21 
09:49:48,403::brokerlink::73::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
 Success, was notification of state_transition (EngineStarting-EngineUp) sent? 
ignored
MainThread::INFO::2022-08-21 
09:49:48,713::hosted_engine::517::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop)
 Current state EngineUp (score: 3400)
MainThread::INFO::2022-08-21 
09:49:58,725::states::406::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
 Engine vm running on localhost



Broker.log------------------------------
Thread-4::INFO::2022-08-21 
09:47:59,342::cpu_load_no_engine::142::cpu_load_no_engine.CpuLoadNoEngine::(calculate_load)
 System load total=0.0241, engine=0.0013, non-engine=0.0228
Thread-3::INFO::2022-08-21 
09:48:01,311::mem_free::51::mem_free.MemFree::(action) memFree: 96106
Thread-5::INFO::2022-08-21 
09:48:05,612::engine_health::246::engine_health.EngineHealth::(_result_from_stats)
 VM is up on this host with healthy engine
Thread-2::INFO::2022-08-21 
09:48:08,591::mgmt_bridge::65::mgmt_bridge.MgmtBridge::(action) Found bridge 
ovirtmgmt in up state
Thread-1::WARNING::2022-08-21 
09:48:10,352::network::121::network.Network::(_dns) DNS query failed: 
; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> +tries=1 +time=5 +tcp
;; global options: +cmd
;; connection timed out; no servers could be reached
 
Thread-1::WARNING::2022-08-21 
09:48:10,352::network::92::network.Network::(action) Failed to verify network 
status, (4 out of 5)
Thread-3::INFO::2022-08-21 
09:48:11,389::mem_free::51::mem_free.MemFree::(action) memFree: 96089
Thread-5::INFO::2022-08-21 
09:48:15,707::engine_health::246::engine_health.EngineHealth::(_result_from_stats)
 VM is up on this host with healthy engine
Thread-2::INFO::2022-08-21 
09:48:18,662::mgmt_bridge::65::mgmt_bridge.MgmtBridge::(action) Found bridge 
ovirtmgmt in up state
Thread-1::WARNING::2022-08-21 
09:48:18,879::network::121::network.Network::(_dns) DNS query failed: 
; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> +tries=1 +time=5 +tcp
;; global options: +cmd
;; connection timed out; no servers could be reached
 
Thread-3::INFO::2022-08-21 
09:48:21,467::mem_free::51::mem_free.MemFree::(action) memFree: 96072
Thread-1::WARNING::2022-08-21 
09:48:24,904::network::121::network.Network::(_dns) DNS query failed: 
; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> +tries=1 +time=5 +tcp
;; global options: +cmd
;; connection timed out; no servers could be reached
 
Thread-5::INFO::2022-08-21 
09:48:25,808::engine_health::246::engine_health.EngineHealth::(_result_from_stats)
 VM is up on this host with healthy engine
Thread-2::INFO::2022-08-21 
09:48:28,740::mgmt_bridge::65::mgmt_bridge.MgmtBridge::(action) Found bridge 
ovirtmgmt in up state
Thread-1::WARNING::2022-08-21 
09:48:30,416::network::121::network.Network::(_dns) DNS query failed: 
; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> +tries=1 +time=5 +tcp
;; global options: +cmd
;; connection timed out; no servers could be reached
 
Thread-1::WARNING::2022-08-21 
09:48:30,416::network::92::network.Network::(action) Failed to verify network 
status, (2 out of 5)
Thread-3::INFO::2022-08-21 
09:48:31,545::mem_free::51::mem_free.MemFree::(action) memFree: 96064
Thread-5::INFO::2022-08-21 
09:48:35,909::engine_health::246::engine_health.EngineHealth::(_result_from_stats)
 VM is up on this host with healthy engine
Thread-1::WARNING::2022-08-21 
09:48:35,940::network::121::network.Network::(_dns) DNS query failed: 
; <<>> DiG 9.11.36-RedHat-9.11.36-3.el8 <<>> +tries=1 +time=5 +tcp
;; global options: +cmd
;; connection timed out; no servers could be reached
 
Thread-1::WARNING::2022-08-21 
09:48:37,480::network::92::network.Network::(action) Failed to verify network 
status, (4 out of 5)
Thread-2::INFO::2022-08-21 
09:48:38,809::mgmt_bridge::65::mgmt_bridge.MgmtBridge::(action) Found bridge 
ovirtmgmt in up state
Thread-3::INFO::2022-08-21 
09:48:41,623::mem_free::51::mem_free.MemFree::(action) memFree: 96014
Thread-1::INFO::2022-08-21 09:48:42,549::network::88::network.Network::(action) 
Successfully verified network status
Thread-5::INFO::2022-08-21 
09:48:46,011::engine_health::246::engine_health.EngineHealth::(_result_from_stats)
 VM is up on this host with healthy engine
Listener::ERROR::2022-08-21 
09:48:46,639::notifications::42::ovirt_hosted_engine_ha.broker.notifications.Notifications::(send_email)
 (530, b'5.7.1 Authentication required', 'ale...@example.org.hr')
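
The network-check warnings in the broker log above can be tallied with a short script. This is only a sketch: `network_failures` is a hypothetical helper name, the regular expressions are derived from the excerpt above rather than from the oVirt source, and the log does not say what the count in "(N out of 5)" measures, so it is returned unchanged.

```python
import re

# Matches e.g. "Failed to verify network status, (4 out of 5)".
FAILED = re.compile(r"Failed to verify network status, \((\d+) out of (\d+)\)")
# Matches the date and time in front of a "::network::" log entry.
STAMP = re.compile(r"::(\d{4}-\d{2}-\d{2})\s+(\d{2}:\d{2}:\d{2}),\d+::network::")


def network_failures(log_text):
    """Return (timestamp, count, total) for each failed network check."""
    # The mail client wrapped long log lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    results = []
    for match in FAILED.finditer(flat):
        # The closest preceding network timestamp belongs to this entry.
        stamps = STAMP.findall(flat[:match.start()])
        when = " ".join(stamps[-1]) if stamps else None
        results.append((when, int(match.group(1)), int(match.group(2))))
    return results
```

It could be fed the whole broker log, for example `network_failures(open("/var/log/ovirt-hosted-engine-ha/broker.log").read())`, assuming the log is in its usual location. Comparing the returned timestamps with the score drops in agent.log would show whether each restart follows a run of failed checks.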

At first I thought this was related to these bug reports:
https://lists.ovirt.org/archives/list/users@ovirt.org/thread/2HTD5WR43M5MUTEDMM4HRFBADIXEQNB4/
https://bugzilla.redhat.com/show_bug.cgi?id=1984356

But that bug should already be fixed in this oVirt version.
I have been monitoring the network, but the error keeps occurring even when network 
load is low.
I also ran continuous dig and ping commands on VMs running on the same host 
as the Hosted Engine and saw no network problems, not even a single 
dropped connection.
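
The same kind of test can also be run on the host itself rather than inside guest VMs, since the HA broker runs on the host. Below is a minimal sketch, assuming the broker invokes dig with exactly the options shown in the log banner (`+tries=1 +time=5 +tcp`) and no query name; `dns_check` and `watch` are made-up names, not part of oVirt.

```python
import subprocess
import time
from datetime import datetime

# Options copied from the "DiG ... <<>> +tries=1 +time=5 +tcp" banner in broker.log.
# Assumption: the broker passes no query name, so none is passed here either.
DIG_CMD = ["dig", "+tries=1", "+time=5", "+tcp"]


def dns_check(run=subprocess.run):
    """Run one dig query and return (ok, combined_output)."""
    try:
        proc = run(DIG_CMD, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                   universal_newlines=True, timeout=15)
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, str(exc)
    return proc.returncode == 0, proc.stdout


def watch(rounds, interval=5.0, check=dns_check, log=print):
    """Repeat the check `rounds` times; log failures and return how many failed."""
    failures = 0
    for _ in range(rounds):
        ok, output = check()
        if not ok:
            failures += 1
            log("{} dig failed:\n{}".format(datetime.now().isoformat(), output))
        time.sleep(interval)
    return failures
```

Calling `watch(720)` on the host would repeat the check 720 times with a 5-second pause between attempts. If failures show up here at the same times as the "DNS query failed" warnings in the broker log, that would suggest the problem lies on the host's own path to its DNS servers rather than in the guest network.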

Are there any solutions or next steps I should try?