From the agent.log:

MainThread::INFO::2017-06-15 11:16:50,583::states::473::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine vm is running on host ovirt-hyp-02.reis.com (id 2)
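For reference, you can follow the agent's view of the state machine live on whichever host currently holds the VM, with a plain tail of the default agent log:

# tail -f /var/log/ovirt-hosted-engine-ha/agent.log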
It looks like the HE VM was started successfully. Is it possible that the ovirt-engine service could not be started on the HE VM? Could you try to start the HE VM using the command below, and then log into the VM console?

# hosted-engine --vm-start

Also, please check:

# gluster volume status engine
# gluster volume heal engine info

Please also check if there are errors in the gluster mount logs, at /var/log/glusterfs/rhev-data-center-mnt..<engine>.log
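If it's easier to grab everything in one pass, something like the below should work from any of the hosts. This is just a sketch, with the volume names taken from your earlier mail; the split-brain check is an extra precaution rather than something your logs point to yet:

for vol in engine data; do
    gluster volume status "$vol"                   # every brick should show "Y" in the Online column
    gluster volume heal "$vol" info                # entries pending self-heal, per brick
    gluster volume heal "$vol" info split-brain    # entries that would need manual resolution
done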
On Thu, Jun 15, 2017 at 8:53 PM, Joel Diaz <mrjoeld...@gmail.com> wrote:

> Sorry, I forgot to attach the requested logs in the previous email.
>
> Thanks,
>
> On Jun 15, 2017 9:38 AM, "Joel Diaz" <mrjoeld...@gmail.com> wrote:
>
> Good morning,
>
> Requested info below, along with some additional info.
>
> You'll notice the data volume is not mounted.
>
> Any help in getting HE back running would be greatly appreciated.
>
> Thank you,
>
> Joel
>
> [root@ovirt-hyp-01 ~]# hosted-engine --vm-status
>
> --== Host 1 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : False
> Hostname                           : ovirt-hyp-01.example.lan
> Host ID                            : 1
> Engine status                      : unknown stale-data
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 5558a7d3
> local_conf_timestamp               : 20356
> Host timestamp                     : 20341
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=20341 (Fri Jun 9 14:38:57 2017)
>     host-id=1
>     score=3400
>     vm_conf_refresh_time=20356 (Fri Jun 9 14:39:11 2017)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=EngineDown
>     stopped=False
>
> --== Host 2 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : False
> Hostname                           : ovirt-hyp-02.example.lan
> Host ID                            : 2
> Engine status                      : unknown stale-data
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : 936d4cf3
> local_conf_timestamp               : 20351
> Host timestamp                     : 20337
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=20337 (Fri Jun 9 14:39:03 2017)
>     host-id=2
>     score=3400
>     vm_conf_refresh_time=20351 (Fri Jun 9 14:39:17 2017)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=EngineDown
>     stopped=False
>
> --== Host 3 status ==--
>
> conf_on_shared_storage             : True
> Status up-to-date                  : False
> Hostname                           : ovirt-hyp-03.example.lan
> Host ID                            : 3
> Engine status                      : unknown stale-data
> Score                              : 3400
> stopped                            : False
> Local maintenance                  : False
> crc32                              : f646334e
> local_conf_timestamp               : 20391
> Host timestamp                     : 20377
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=20377 (Fri Jun 9 14:39:37 2017)
>     host-id=3
>     score=3400
>     vm_conf_refresh_time=20391 (Fri Jun 9 14:39:51 2017)
>     conf_on_shared_storage=True
>     maintenance=False
>     state=EngineStop
>     stopped=False
>     timeout=Thu Jan 1 00:43:08 1970
>
> [root@ovirt-hyp-01 ~]# gluster peer status
> Number of Peers: 2
>
> Hostname: 192.168.170.143
> Uuid: b2b30d05-cf91-4567-92fd-022575e082f5
> State: Peer in Cluster (Connected)
> Other names:
> 10.0.0.2
>
> Hostname: 192.168.170.147
> Uuid: 4e50acc4-f3cb-422d-b499-fb5796a53529
> State: Peer in Cluster (Connected)
> Other names:
> 10.0.0.3
>
> [root@ovirt-hyp-01 ~]# gluster volume info all
>
> Volume Name: data
> Type: Replicate
> Volume ID: 1d6bb110-9be4-4630-ae91-36ec1cf6cc02
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.170.141:/gluster_bricks/data/data
> Brick2: 192.168.170.143:/gluster_bricks/data/data
> Brick3: 192.168.170.147:/gluster_bricks/data/data (arbiter)
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> performance.low-prio-threads: 32
> network.remote-dio: off
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> features.shard: on
> user.cifs: off
> storage.owner-uid: 36
> storage.owner-gid: 36
> network.ping-timeout: 30
> performance.strict-o-direct: on
> cluster.granular-entry-heal: enable
>
> Volume Name: engine
> Type: Replicate
> Volume ID: b160f0b2-8bd3-4ff2-a07c-134cab1519dd
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: 192.168.170.141:/gluster_bricks/engine/engine
> Brick2: 192.168.170.143:/gluster_bricks/engine/engine
> Brick3: 192.168.170.147:/gluster_bricks/engine/engine (arbiter)
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> performance.low-prio-threads: 32
> network.remote-dio: off
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> features.shard: on
> user.cifs: off
> storage.owner-uid: 36
> storage.owner-gid: 36
> network.ping-timeout: 30
> performance.strict-o-direct: on
> cluster.granular-entry-heal: enable
>
> [root@ovirt-hyp-01 ~]# df -h
> Filesystem                                     Size  Used Avail Use% Mounted on
> /dev/mapper/centos_ovirt--hyp--01-root          50G  4.1G   46G   9% /
> devtmpfs                                       7.7G     0  7.7G   0% /dev
> tmpfs                                          7.8G     0  7.8G   0% /dev/shm
> tmpfs                                          7.8G  8.7M  7.7G   1% /run
> tmpfs                                          7.8G     0  7.8G   0% /sys/fs/cgroup
> /dev/mapper/centos_ovirt--hyp--01-home          61G   33M   61G   1% /home
> /dev/mapper/gluster_vg_sdb-gluster_lv_engine    50G  7.6G   43G  16% /gluster_bricks/engine
> /dev/mapper/gluster_vg_sdb-gluster_lv_data     730G  157G  574G  22% /gluster_bricks/data
> /dev/sda1                                      497M  173M  325M  35% /boot
> ovirt-hyp-01.example.lan:engine                 50G  7.6G   43G  16% /rhev/data-center/mnt/glusterSD/ovirt-hyp-01.example.lan:engine
> tmpfs                                          1.6G     0  1.6G   0% /run/user/0
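[Replying inline] As you noted, the data volume is indeed missing from the df output above; only the engine volume is mounted under /rhev/data-center/mnt/glusterSD, while both bricks are present. VDSM should mount the data domain again on its own once the engine is up and the domain is activated. If you want to inspect the volume in the meantime, you can mount it manually at a temporary location. A sketch, reusing the hostname from your engine mount and a mount point of my choosing:

# mkdir -p /mnt/data-check
# mount -t glusterfs ovirt-hyp-01.example.lan:/data /mnt/data-check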
> [root@ovirt-hyp-01 ~]# systemctl list-unit-files | grep ovirt
> ovirt-ha-agent.service                 enabled
> ovirt-ha-broker.service                enabled
> ovirt-imageio-daemon.service           disabled
> ovirt-vmconsole-host-sshd.service      enabled
>
> [root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-agent.service
> ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
>    Active: active (running) since Thu 2017-06-15 08:56:15 EDT; 21min ago
>  Main PID: 3150 (ovirt-ha-agent)
>    CGroup: /system.slice/ovirt-ha-agent.service
>            └─3150 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
>
> Jun 15 08:56:15 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
> Jun 15 08:56:15 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...
> Jun 15 09:17:18 ovirt-hyp-01.example.lan ovirt-ha-agent[3150]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost
>
> [root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-broker.service
> ● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
>    Active: active (running) since Thu 2017-06-15 08:54:06 EDT; 24min ago
>  Main PID: 968 (ovirt-ha-broker)
>    CGroup: /system.slice/ovirt-ha-broker.service
>            └─968 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
>
> Jun 15 08:54:06 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
> Jun 15 08:54:06 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
> Jun 15 08:56:16 ovirt-hyp-01.example.lan ovirt-ha-broker[968]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: '...1b55bcf76'
>                                                                Traceback (most recent call last):
>                                                                  File "/usr/lib/python2.7/site-packages/ovirt...
> Hint: Some lines were ellipsized, use -l to show in full.
>
> [root@ovirt-hyp-01 ~]# systemctl restart ovirt-ha-agent.service
> [root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-agent.service
> ● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
>    Active: active (running) since Thu 2017-06-15 09:19:21 EDT; 26s ago
>  Main PID: 8563 (ovirt-ha-agent)
>    CGroup: /system.slice/ovirt-ha-agent.service
>            └─8563 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon
>
> Jun 15 09:19:21 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
> Jun 15 09:19:21 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...
>
> [root@ovirt-hyp-01 ~]# systemctl restart ovirt-ha-broker.service
> [root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-broker.service
> ● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
>    Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
>    Active: active (running) since Thu 2017-06-15 09:20:59 EDT; 28s ago
>  Main PID: 8844 (ovirt-ha-broker)
>    CGroup: /system.slice/ovirt-ha-broker.service
>            └─8844 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon
>
> Jun 15 09:20:59 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
> Jun 15 09:20:59 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
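[Replying inline] The broker traceback above is ellipsized by systemd, so the interesting part is cut off. The full text should be in the journal or in the broker's own log; for example:

# journalctl -u ovirt-ha-broker --no-pager -l
# tail -n 100 /var/log/ovirt-hosted-engine-ha/broker.log

If that "Error handling request" message repeats after the restart, please include that output as well.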
>
> On Jun 14, 2017 4:45 AM, "Sahina Bose" <sab...@redhat.com> wrote:
>
>> What do the outputs of "hosted-engine --vm-status" and "gluster volume
>> status engine" tell you? Are all the bricks running as per gluster vol
>> status?
>>
>> Can you try to restart the ovirt-ha-agent and ovirt-ha-broker services?
>>
>> If HE still has issues powering up, please provide agent.log and
>> broker.log from /var/log/ovirt-hosted-engine-ha and the gluster mount logs
>> from /var/log/glusterfs/rhev-data-center-mnt <engine>.log
>>
>> On Thu, Jun 8, 2017 at 6:57 PM, Joel Diaz <mrjoeld...@gmail.com> wrote:
>>
>>> Good morning oVirt community,
>>>
>>> I'm running a three-host gluster environment with hosted engine.
>>>
>>> Yesterday the engine went down and has not been able to come up
>>> properly. It tries to start on all three hosts.
>>>
>>> I have two gluster volumes, data and engine. The data storage domain
>>> volume is no longer mounted, but the engine volume is up. I've restarted
>>> the gluster service and made sure both volumes were running. The data
>>> volume still will not mount.
>>>
>>> How can I get the engine running properly again?
>>>
>>> Thanks,
>>>
>>> Joel
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users