Good morning,

The requested info is below, along with some additional info.
You'll notice the data volume is not mounted. Any help in getting HE back running would be greatly appreciated.

Thank you,

Joel

[root@ovirt-hyp-01 ~]# hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt-hyp-01.example.lan
Host ID                            : 1
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 5558a7d3
local_conf_timestamp               : 20356
Host timestamp                     : 20341
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=20341 (Fri Jun 9 14:38:57 2017)
        host-id=1
        score=3400
        vm_conf_refresh_time=20356 (Fri Jun 9 14:39:11 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False

--== Host 2 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt-hyp-02.example.lan
Host ID                            : 2
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : 936d4cf3
local_conf_timestamp               : 20351
Host timestamp                     : 20337
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=20337 (Fri Jun 9 14:39:03 2017)
        host-id=2
        score=3400
        vm_conf_refresh_time=20351 (Fri Jun 9 14:39:17 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineDown
        stopped=False

--== Host 3 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : ovirt-hyp-03.example.lan
Host ID                            : 3
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : f646334e
local_conf_timestamp               : 20391
Host timestamp                     : 20377
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=20377 (Fri Jun 9 14:39:37 2017)
        host-id=3
        score=3400
        vm_conf_refresh_time=20391 (Fri Jun 9 14:39:51 2017)
        conf_on_shared_storage=True
        maintenance=False
        state=EngineStop
        stopped=False
        timeout=Thu Jan 1 00:43:08 1970

[root@ovirt-hyp-01 ~]# gluster peer status
Number of Peers: 2

Hostname: 192.168.170.143
Uuid: b2b30d05-cf91-4567-92fd-022575e082f5
State: Peer in Cluster (Connected)
Other names:
10.0.0.2

Hostname: 192.168.170.147
Uuid: 4e50acc4-f3cb-422d-b499-fb5796a53529
State: Peer in Cluster (Connected)
Other names:
10.0.0.3
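(Side note: since both peers show connected, I can also collect per-brick and self-heal state if that helps. A quick sketch of what I'd run — standard gluster CLI, volume names as in my setup above:)

  gluster volume status data        # per-brick process and port status
  gluster volume heal data info     # entries pending self-heal on each brick
  gluster volume status engine
  gluster volume heal engine info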
[root@ovirt-hyp-01 ~]# gluster volume info all

Volume Name: data
Type: Replicate
Volume ID: 1d6bb110-9be4-4630-ae91-36ec1cf6cc02
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.170.141:/gluster_bricks/data/data
Brick2: 192.168.170.143:/gluster_bricks/data/data
Brick3: 192.168.170.147:/gluster_bricks/data/data (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.granular-entry-heal: enable

Volume Name: engine
Type: Replicate
Volume ID: b160f0b2-8bd3-4ff2-a07c-134cab1519dd
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 192.168.170.141:/gluster_bricks/engine/engine
Brick2: 192.168.170.143:/gluster_bricks/engine/engine
Brick3: 192.168.170.147:/gluster_bricks/engine/engine (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: off
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
storage.owner-uid: 36
storage.owner-gid: 36
network.ping-timeout: 30
performance.strict-o-direct: on
cluster.granular-entry-heal: enable

[root@ovirt-hyp-01 ~]# df -h
Filesystem                                    Size  Used Avail Use% Mounted on
/dev/mapper/centos_ovirt--hyp--01-root         50G  4.1G   46G   9% /
devtmpfs                                      7.7G     0  7.7G   0% /dev
tmpfs                                         7.8G     0  7.8G   0% /dev/shm
tmpfs                                         7.8G  8.7M  7.7G   1% /run
tmpfs                                         7.8G     0  7.8G   0% /sys/fs/cgroup
/dev/mapper/centos_ovirt--hyp--01-home         61G   33M   61G   1% /home
/dev/mapper/gluster_vg_sdb-gluster_lv_engine   50G  7.6G   43G  16% /gluster_bricks/engine
/dev/mapper/gluster_vg_sdb-gluster_lv_data    730G  157G  574G  22% /gluster_bricks/data
/dev/sda1                                     497M  173M  325M  35% /boot
ovirt-hyp-01.example.lan:engine                50G  7.6G   43G  16% /rhev/data-center/mnt/glusterSD/ovirt-hyp-01.example.lan:engine
tmpfs                                         1.6G     0  1.6G   0% /run/user/0

[root@ovirt-hyp-01 ~]# systemctl list-unit-files|grep ovirt
ovirt-ha-agent.service                        enabled
ovirt-ha-broker.service                       enabled
ovirt-imageio-daemon.service                  disabled
ovirt-vmconsole-host-sshd.service             enabled

[root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-agent.service
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 08:56:15 EDT; 21min ago
 Main PID: 3150 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
           └─3150 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon

Jun 15 08:56:15 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Jun 15 08:56:15 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...
Jun 15 09:17:18 ovirt-hyp-01.example.lan ovirt-ha-agent[3150]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Engine VM stopped on localhost

[root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-broker.service
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 08:54:06 EDT; 24min ago
 Main PID: 968 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
           └─968 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon

Jun 15 08:54:06 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Jun 15 08:54:06 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
Jun 15 08:56:16 ovirt-hyp-01.example.lan ovirt-ha-broker[968]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.listener.ConnectionHandler ERROR Error handling request, data: '...1b55bcf76'
                                                               Traceback (most recent call last):
                                                                 File "/usr/lib/python2.7/site-packages/ovirt...
Hint: Some lines were ellipsized, use -l to show in full.
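(The broker traceback above is cut off. To capture it in full, I'd use something along these lines — a sketch with standard systemd/journal tools, plus the broker log Sahina asked for below:)

  systemctl status -l ovirt-ha-broker.service       # -l disables ellipsizing, per the hint above
  journalctl -u ovirt-ha-broker --no-pager -n 200   # recent broker journal lines, unabridged
  tail -n 200 /var/log/ovirt-hosted-engine-ha/broker.log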
[root@ovirt-hyp-01 ~]# systemctl restart ovirt-ha-agent.service
[root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-agent.service
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 09:19:21 EDT; 26s ago
 Main PID: 8563 (ovirt-ha-agent)
   CGroup: /system.slice/ovirt-ha-agent.service
           └─8563 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent --no-daemon

Jun 15 09:19:21 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Jun 15 09:19:21 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...

[root@ovirt-hyp-01 ~]# systemctl restart ovirt-ha-broker.service
[root@ovirt-hyp-01 ~]# systemctl status ovirt-ha-broker.service
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-06-15 09:20:59 EDT; 28s ago
 Main PID: 8844 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
           └─8844 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker --no-daemon

Jun 15 09:20:59 ovirt-hyp-01.example.lan systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.
Jun 15 09:20:59 ovirt-hyp-01.example.lan systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...

On Jun 14, 2017 4:45 AM, "Sahina Bose" <sab...@redhat.com> wrote:

> What do the outputs of "hosted-engine --vm-status" and "gluster volume
> status engine" tell you? Are all the bricks running as per gluster vol
> status?
>
> Can you try to restart the ovirt-ha-agent and ovirt-ha-broker services?
>
> If HE still has issues powering up, please provide agent.log and
> broker.log from /var/log/ovirt-hosted-engine-ha and the gluster mount
> logs from /var/log/glusterfs/rhev-data-center-mnt-<engine>.log
>
> On Thu, Jun 8, 2017 at 6:57 PM, Joel Diaz <mrjoeld...@gmail.com> wrote:
>
>> Good morning oVirt community,
>>
>> I'm running a three-host gluster environment with hosted engine.
>>
>> Yesterday the engine went down and has not been able to come up
>> properly. It tries to start on all three hosts.
>>
>> I have two gluster volumes, data and engine. The data storage domain
>> volume is no longer mounted, but the engine volume is up. I've restarted
>> the gluster service and made sure both volumes were running. The data
>> volume still will not mount.
>>
>> How can I get the engine running properly again?
>>
>> Thanks,
>>
>> Joel
>>
>> _______________________________________________
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
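P.S. Since the engine volume mounts but data does not, here is what I plan to try next — a sketch only; the test mount point /mnt/data-test is my own invention, and the server:/volume form is modeled on the engine mount visible in df -h above:

  # mount the data volume by hand to surface the real mount error
  mkdir -p /mnt/data-test
  mount -t glusterfs ovirt-hyp-01.example.lan:/data /mnt/data-test

  # ask the HA tooling to reconnect hosted-engine storage, then re-check
  hosted-engine --connect-storage
  hosted-engine --vm-status

  # if the engine VM still doesn't come up on its own
  hosted-engine --vm-start
  tail -f /var/log/ovirt-hosted-engine-ha/agent.log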