Additionally, when the agent tries to boot up the engine, I get the following status:
[root@ovirt-sj-02 images]# hosted-engine --vm-status

--== Host ovirt-sj-01.ictv.com (id: 1) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ovirt-sj-01.ictv.com
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : f4f95c83
local_conf_timestamp               : 4285
Host timestamp                     : 4285
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=4285 (Mon Aug 19 15:11:19 2019)
    host-id=1
    score=3400
    vm_conf_refresh_time=4285 (Mon Aug 19 15:11:20 2019)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineStarting
    stopped=False

--== Host ovirt-sj-02.ictv.com (id: 2) status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : ovirt-sj-02.ictv.com
Host ID                            : 2
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "Down"}
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : c2669fe8
local_conf_timestamp               : 4153
Host timestamp                     : 4153
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=4153 (Mon Aug 19 15:09:47 2019)
    host-id=2
    score=3400
    vm_conf_refresh_time=4153 (Mon Aug 19 15:09:47 2019)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineStart
    stopped=False

From: "Vrgotic, Marko" <m.vrgo...@activevideo.com>
Date: Monday, 19 August 2019 at 17:17
To: "users@ovirt.org" <users@ovirt.org>
Subject: Issues with oVirt-Engine start - oVirt 4.3.4

Dear oVirt,

While working out a procedure to get an NFS v4 mount from NetApp working with oVirt, the following steps turned out to be the way to prepare it for the oVirt SHE and VM guests:

* mkdir /mnt/rhevstore
* mount -t nfs 10.20.30.40:/ovirt_hosted_engine /mnt/rhevstore
* chown -R 36:36 /mnt/rhevstore
* chmod -R 755 /mnt/rhevstore
* umount /mnt/rhevstore

This works fine, and it needs to be executed on each hypervisor before it is provisioned into oVirt (a consolidated sketch of these steps follows below).

However, just today I discovered that the command chmod -R 755 /mnt/rhevstore, if executed on a new, to-be-added hypervisor after oVirt is already running, brings the oVirt Engine into a broken state.
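For reference, here is a minimal sketch of the per-hypervisor preparation, collecting the commands above into one script. The export address and mount point are the ones from this message, and the guard comments only restate the behaviour described in this thread; this is not an official procedure:

    #!/bin/bash
    # Sketch: prepare a new hypervisor's access to the hosted-engine NFS export.
    # Run this only BEFORE the host is provisioned into oVirt; re-running
    # chmod -R against the storage while the engine is already live is exactly
    # what broke the setup described in this thread.
    set -euo pipefail

    MNT=/mnt/rhevstore                        # temporary mount point
    EXPORT=10.20.30.40:/ovirt_hosted_engine   # NetApp NFS export

    mkdir -p "$MNT"
    mount -t nfs "$EXPORT" "$MNT"
    chown -R 36:36 "$MNT"    # vdsm:kvm
    chmod -R 755 "$MNT"
    umount "$MNT"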
The moment I executed the above on the 3rd hypervisor, before provisioning it into oVirt, the following occurred:

* The Engine threw the following error:

2019-08-19 13:16:31,425Z ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStatusVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-82) [] Failed in 'SpmStatusVDS' method

* The connection was lost:

packet_write_wait: Connection to 10.210.11.10 port 22: Broken pipe

* And VDSM on the SHE-hosting hypervisor started logging errors like:

2019-08-19 15:00:52,340+0000 INFO (jsonrpc/0) [jsonrpc.JsonRpcServer] RPC call Host.getAllVmStats succeeded in 0.00 seconds (__init__:312)
2019-08-19 15:00:53,865+0000 WARN (vdsm.Scheduler) [Executor] Worker blocked: <Worker name=periodic/2 running <Task <Operation action=<vdsm.virt.sampling.HostMonitor object at 0x7f59442c3d50> at 0x7f59442c3b90> timeout=15, duration=225.00 at 0x7f592476df90> task#=578 at 0x7f59442ef910>, traceback:
File: "/usr/lib64/python2.7/threading.py", line 785, in __bootstrap
  self.__bootstrap_inner()
File: "/usr/lib64/python2.7/threading.py", line 812, in __bootstrap_inner
  self.run()
File: "/usr/lib64/python2.7/threading.py", line 765, in run
  self.__target(*self.__args, **self.__kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/common/concurrent.py", line 195, in run
  ret = func(*args, **kwargs)
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 301, in _run
  self._execute_task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 315, in _execute_task
  task()
File: "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 391, in __call__
  self._callable()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/periodic.py", line 186, in __call__
  self._func()
File: "/usr/lib/python2.7/site-packages/vdsm/virt/sampling.py", line 481, in __call__
  stats = hostapi.get_stats(self._cif, self._samples.stats())
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 79, in get_stats
  ret['haStats'] = _getHaInfo()
File: "/usr/lib/python2.7/site-packages/vdsm/host/api.py", line 177, in _getHaInfo
  stats = instance.get_all_stats()
File: "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/client/client.py", line 94, in get_all_stats
  stats = broker.get_stats_from_storage()
File: "/usr/lib/python2.7/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 143, in get_stats_from_storage
  result = self._proxy.get_stats()
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1233, in __call__
  return self.__send(self.__name, args)
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1591, in __request
  verbose=self.__verbose
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1273, in request
  return self.single_request(host, handler, request_body, verbose)
File: "/usr/lib64/python2.7/xmlrpclib.py", line 1303, in single_request
  response = h.getresponse(buffering=True)
File: "/usr/lib64/python2.7/httplib.py", line 1113, in getresponse
  response.begin()
File: "/usr/lib64/python2.7/httplib.py", line 444, in begin
  version, status, reason = self._read_status()
File: "/usr/lib64/python2.7/httplib.py", line 400, in _read_status
  line = self.fp.readline(_MAXLINE + 1)
File: "/usr/lib64/python2.7/socket.py", line 476, in readline
  data = self._sock.recv(self._rbufsize) (executor:363)
2019-08-19 15:00:54,103+0000 INFO (jsonrpc/1) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:312)

I am unable to boot the Engine VM; it ends up in status ForceStop.

hosted-engine --vm-status shows:
[root@ovirt-sj-02 ~]# hosted-engine --vm-status
The hosted engine configuration has not been retrieved from shared storage. Please ensure that ovirt-ha-agent is running and the storage server is reachable.

But the storage is mounted and reachable, and ovirt-ha-agent is running:

[root@ovirt-sj-02 ~]# systemctl status ovirt-ha-agent
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2019-08-19 14:57:07 UTC; 23s ago
 Main PID: 43388 (ovirt-ha-agent)
    Tasks: 2
   CGroup: /system.slice/ovirt-ha-agent.service
           └─43388 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent

Can somebody help me with what to do?

--
Met vriendelijke groet / Kind regards,
Marko Vrgotic