On Tue, Sep 18, 2018 at 9:21 AM <a...@pioner.kz> wrote:

> Good day to all.
> I have some issues with oVirt 4.2.6, but here is the main one:
> I have two CentOS 7 nodes with the same config and the latest oVirt 4.2.6,
> with HostedEngine with its disk on NFS storage.
> Some virtual machines are also running fine.
> When HostedEngine is running on one node (srv02.local), everything is fine.
> After migrating it to the other node (srv00.local), I see that the agent
> cannot check the liveliness of HostedEngine. After a few minutes HostedEngine
> reboots, and after some time I see the same situation. After migration to
> another node (srv00.local) all looks OK.
>
> Output of the hosted-engine --vm-status command when HostedEngine is on the srv00 node:
>
> --== Host 1 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : srv02.local
> Host ID                            : 1
> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down_unexpected", "detail": "unknown"}
> Score                              : 0
> stopped                            : False
> Local maintenance                  : False
> crc32                              : ecc7ad2d
> local_conf_timestamp               : 78328
> Host timestamp                     : 78328
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=78328 (Tue Sep 18 12:44:18 2018)
>     host-id=1
>     score=0
>     vm_conf_refresh_time=78328 (Tue Sep 18 12:44:18 2018)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=EngineUnexpectedlyDown
>     stopped=False
>     timeout=Fri Jan 2 03:49:58 1970
>
>
> --== Host 2 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : srv00.local
> Host ID                            : 2
> Engine status                      : {"reason": "failed liveliness check", "health": "bad", "vm": "up", "detail": "Up"}
>

"vm": "up" refers to the VM status at the virtualization level, obtained by polling the local vdsm. "health": "bad" refers instead to a liveliness check performed against the engine portal over HTTP.
Bad name resolution or network routing issues can cause this.
I'd suggest checking that everything is fine on the network side.

> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 1d62b106
> local_conf_timestamp               : 326288
> Host timestamp                     : 326288
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=326288 (Tue Sep 18 12:44:21 2018)
>     host-id=2
>     score=3400
>     vm_conf_refresh_time=326288 (Tue Sep 18 12:44:21 2018)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=EngineStarting
>     stopped=False
>
> Log agent.log from srv00.local:
>
> MainThread::INFO::2018-09-18 12:40:51,749::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:40:52,052::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2018-09-18 12:41:01,066::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:41:01,374::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2018-09-18 12:41:11,393::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
> MainThread::INFO::2018-09-18 12:41:11,393::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.
> HostedEngine::(refresh) Host srv02.local.pioner.kz (id 1): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=78128 (Tue Sep 18 12:40:58 2018)\nhost-id=1\nscore=0\nvm_conf_refresh_time=78128 (Tue Sep 18 12:40:58 2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUnexpectedlyDown\nstopped=False\ntimeout=Fri Jan 2 03:49:58 1970\n', 'hostname': 'srv02.local.pioner.kz', 'alive': True, 'host-id': 1, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down_unexpected', 'detail': 'unknown'}, 'score': 0, 'stopped': False, 'maintenance': False, 'crc32': 'e18e3f22', 'local_conf_timestamp': 78128, 'host-ts': 78128}
> MainThread::INFO::2018-09-18 12:41:11,393::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 2): {'engine-health': {'reason': 'failed liveliness check', 'health': 'bad', 'vm': 'up', 'detail': 'Up'}, 'bridge': True, 'mem-free': 12763.0, 'maintenance': False, 'cpu-load': 0.0364, 'gateway': 1.0, 'storage-domain': True}
> MainThread::INFO::2018-09-18 12:41:11,393::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:41:11,703::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2018-09-18 12:41:21,716::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:41:22,020::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2018-09-18 12:41:31,033::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:41:31,344::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
>
> As we can see, the agent thinks that HostedEngine is still powering up.
> I cannot do anything about it. I have already reinstalled the srv00 node
> many times without success.
> One time I even had to uninstall the ovirt* and vdsm* packages. One
> interesting point here: after installing just "yum install
> http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm" on this node,
> I tried to install this node from the engine web interface with the "Deploy" action.
> But the installation was unsuccessful until I installed
> ovirt-hosted-engine-ha on this node. I don't see in the documentation that it
> is needed before installing new hosts. But this is just for information and
> checking. After installing ovirt-hosted-engine-ha, the node was installed with
> HostedEngine support. But the main issue has not changed.
> Thanks in advance for help.
> BR,
> Alexandr
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
> List Archives: https://lists.ovirt.org/archives/list/users@ovirt.org/message/7KGDIM3X3G4QRCRQKQENVKD2JWSOFGK2/
>
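
To narrow down the two causes mentioned above (name resolution and HTTP reachability of the engine), something like the following could be run with Python 3 on the host that fails the liveliness check. It is only a rough sketch under a few assumptions: the function names are mine, the engine FQDN in the usage comment is a placeholder, and the health URL path `/ovirt-engine/services/health` is my assumption about which page the liveliness check requests. It is not taken from the agent's code.

```python
import json
import socket
import urllib.request


def interpret_engine_status(raw):
    """Split the 'Engine status' JSON into its VM-level and engine-level parts."""
    status = json.loads(raw)
    vm_up = status.get("vm") == "up"           # what the local vdsm reports
    healthy = status.get("health") == "good"   # result of the HTTP liveliness check
    if vm_up and healthy:
        return "engine reachable"
    if vm_up:
        return "vm running, engine not answering over http"
    return "vm not running on this host"


def check_engine(fqdn, timeout=5.0):
    """Resolve fqdn, then request the health page.

    Returns (resolved_ip_or_None, short result string).
    """
    try:
        ip = socket.gethostbyname(fqdn)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return None, "name resolution failed: %s" % exc

    # Assumed health page path; adjust if your engine exposes a different one.
    url = "http://%s/ovirt-engine/services/health" % fqdn
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return ip, "HTTP %d" % resp.status
    except OSError as exc:  # URLError, HTTPError and timeouts are OSError subclasses
        return ip, "http check failed: %s" % exc


# Example usage (hypothetical FQDN, run on the affected host):
#   print(check_engine("engine.example.local"))
```

If name resolution fails or returns an unexpected address on srv00.local but not on srv02.local, that points to DNS or /etc/hosts differences between the two hosts. If resolution is fine but the HTTP request fails, routing or firewalling between srv00.local and the engine VM is the more likely suspect.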