On Tue, Sep 18, 2018 at 9:21 AM <a...@pioner.kz> wrote:

> Good day to all.
> I have some issues with oVirt 4.2.6. The main one is this:
> I have two CentOS 7 nodes with the same configuration and the latest
> oVirt 4.2.6, with the HostedEngine disk on NFS storage.
> Some virtual machines are also running fine.
> When HostedEngine runs on one node (srv02.local), everything is
> fine.
> After migrating it to the other node (srv00.local), I see that the agent
> cannot check the liveliness of HostedEngine. After a few minutes
> HostedEngine reboots, and after some time I see the same situation. After
> migrating back to the other node (srv02.local), all looks OK.
>
> Output of hosted-engine --vm-status when HostedEngine is on the srv00 node:
> --== Host 1 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : srv02.local
> Host ID                            : 1
> Engine status                      : {"reason": "vm not running on this
> host", "health": "bad", "vm": "down_unexpected", "detail": "unknown"}
> Score                              : 0
> stopped                            : False
> Local maintenance                  : False
> crc32                              : ecc7ad2d
> local_conf_timestamp               : 78328
> Host timestamp                     : 78328
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=78328 (Tue Sep 18 12:44:18 2018)
>         host-id=1
>         score=0
>         vm_conf_refresh_time=78328 (Tue Sep 18 12:44:18 2018)
>         conf_on_shared_storage=True
>         maintenance=False
>         state=EngineUnexpectedlyDown
>         stopped=False
>         timeout=Fri Jan  2 03:49:58 1970
>
>
> --== Host 2 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : True
> Hostname                           : srv00.local
> Host ID                            : 2
> Engine status                      : {"reason": "failed liveliness check",
> "health": "bad", "vm": "up", "detail": "Up"}
>

vm: up refers to the VM status at the virtualization level, obtained by
polling the local vdsm; health: bad refers instead to a liveliness check on
the engine portal over HTTP.
Bad name resolution or network routing issues can cause this. I'd suggest
checking that everything is fine on the network side.
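
To make the two checks concrete, here is a minimal sketch that could be run on srv00 while the engine VM is up there. It is not the code the HA agent itself runs. It assumes Python 3 is available on the host, that the engine serves its health page at /ovirt-engine/services/health (please verify the path on your setup), and the FQDN engine.example is only a placeholder for your real engine name.

```python
import sys
import socket
import urllib.request

# Assumption: path of the engine health page; verify it on your installation.
HEALTH_PATH = "/ovirt-engine/services/health"


def resolve(fqdn):
    """Return the sorted addresses the local resolver gives for fqdn ([] on failure)."""
    try:
        infos = socket.getaddrinfo(fqdn, 80, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def fetch_health(fqdn, timeout=5):
    """Return (http_status, body) for the health page, or (None, error text)."""
    url = "http://%s%s" % (fqdn, HEALTH_PATH)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except OSError as exc:  # URLError, HTTPError and timeouts derive from OSError
        return None, str(exc)


def diagnose(engine_status, fqdn, resolver=resolve, fetcher=fetch_health):
    """Explain an 'Engine status' dict as printed by hosted-engine --vm-status."""
    if engine_status.get("vm") != "up":
        return "VM is not up on this host, so the HTTP check does not apply"
    if engine_status.get("health") == "good":
        return "VM is up and the engine answered the liveliness check"
    addresses = resolver(fqdn)
    if not addresses:
        return "VM is up but %s does not resolve from this host" % fqdn
    status, body = fetcher(fqdn)
    if status is None:
        return "VM is up, %s resolves to %s, but HTTP failed: %s" % (
            fqdn, ", ".join(addresses), body)
    return "VM is up and HTTP returned %s from this host: %s" % (status, body.strip())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Status copied from the srv00 output quoted above; pass the engine FQDN as argv[1].
    reported = {"reason": "failed liveliness check", "health": "bad",
                "vm": "up", "detail": "Up"}
    print(diagnose(reported, sys.argv[1]))
```

Running it on both hosts with the same engine FQDN and comparing the answers should show whether srv00 resolves the name differently or cannot reach the engine over HTTP.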



> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 1d62b106
> local_conf_timestamp               : 326288
> Host timestamp                     : 326288
> Extra metadata (valid at timestamp):
>         metadata_parse_version=1
>         metadata_feature_version=1
>         timestamp=326288 (Tue Sep 18 12:44:21 2018)
>         host-id=2
>         score=3400
>         vm_conf_refresh_time=326288 (Tue Sep 18 12:44:21 2018)
>         conf_on_shared_storage=True
>         maintenance=False
>         state=EngineStarting
>         stopped=False
>
> Log agent.log from srv00.local:
>
> MainThread::INFO::2018-09-18 12:40:51,749::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:40:52,052::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2018-09-18 12:41:01,066::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:41:01,374::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2018-09-18 12:41:11,393::state_machine::169::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
> MainThread::INFO::2018-09-18 12:41:11,393::state_machine::174::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host srv02.local.pioner.kz (id 1): {'conf_on_shared_storage': True, 'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=78128 (Tue Sep 18 12:40:58 2018)\nhost-id=1\nscore=0\nvm_conf_refresh_time=78128 (Tue Sep 18 12:40:58 2018)\nconf_on_shared_storage=True\nmaintenance=False\nstate=EngineUnexpectedlyDown\nstopped=False\ntimeout=Fri Jan  2 03:49:58 1970\n', 'hostname': 'srv02.local.pioner.kz', 'alive': True, 'host-id': 1, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down_unexpected', 'detail': 'unknown'}, 'score': 0, 'stopped': False, 'maintenance': False, 'crc32': 'e18e3f22', 'local_conf_timestamp': 78128, 'host-ts': 78128}
> MainThread::INFO::2018-09-18 12:41:11,393::state_machine::177::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 2): {'engine-health': {'reason': 'failed liveliness check', 'health': 'bad', 'vm': 'up', 'detail': 'Up'}, 'bridge': True, 'mem-free': 12763.0, 'maintenance': False, 'cpu-load': 0.0364, 'gateway': 1.0, 'storage-domain': True}
> MainThread::INFO::2018-09-18 12:41:11,393::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:41:11,703::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2018-09-18 12:41:21,716::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:41:22,020::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> MainThread::INFO::2018-09-18 12:41:31,033::states::779::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) VM is powering up..
> MainThread::INFO::2018-09-18 12:41:31,344::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineStarting (score: 3400)
> As we can see, the agent thinks that HostedEngine is still powering up.
> I cannot do anything about it. I have already reinstalled the srv00
> node many times without success.
> Once I even had to uninstall the ovirt* and vdsm* packages. There is also
> one interesting point here: after running just "yum install
> http://resources.ovirt.org/pub/yum-repo/ovirt-release42.rpm" on this node,
> I tried to install the node from the engine web interface with the
> "Deploy" action. The installation was unsuccessful until I installed
> ovirt-hosted-engine-ha on this node. I don't see in the documentation that
> it is needed before installing new hosts. This is just for information and
> checking. After I installed ovirt-hosted-engine-ha, the node was installed
> with HostedEngine support. But the main issue did not change.
> Thanks in advance for your help.
> BR,
> Alexandr
> _______________________________________________
> Users mailing list -- users@ovirt.org
> To unsubscribe send an email to users-le...@ovirt.org
> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
> oVirt Code of Conduct:
> https://www.ovirt.org/community/about/community-guidelines/
> List Archives:
> https://lists.ovirt.org/archives/list/users@ovirt.org/message/7KGDIM3X3G4QRCRQKQENVKD2JWSOFGK2/
>
