Hi all, I had set a specific alert email address during deploy and then wanted to change it. I did the following:
On one of the hosts I ran:

hosted-engine --set-shared-config destination-emails ale...@domain.com --type=broker
systemctl restart ovirt-ha-broker.service

I had to do the above since changing the email from the GUI did not have any effect. After that, the emails are received at the new address, but the cluster seems to have some issue recognizing the state of the engine: I am flooded with "EngineMaybeAway-EngineUnexpectedlyDown" emails.

I have also restarted ovirt-ha-agent.service on each host, and I put the cluster into global maintenance and then disabled global maintenance.

In the agent logs of one host I have:

MainThread::ERROR::2018-02-18 11:12:20,751::hosted_engine::720::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_sanlock) cannot get lock on host id 1: host already holds lock on a different host id

Another host logs:

MainThread::INFO::2018-02-18 11:20:23,692::states::682::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(score) Score is 0 due to unexpected vm shutdown at Sun Feb 18 11:15:13 2018
MainThread::INFO::2018-02-18 11:20:23,692::hosted_engine::453::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUnexpectedlyDown (score: 0)

The engine status on the 3 hosts is:

hosted-engine --vm-status

--== Host 1 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : v0
Host ID                            : 1
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : False
crc32                              : cfd15dac
local_conf_timestamp               : 4721144
Host timestamp                     : 4721144
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=4721144 (Sun Feb 18 11:20:33 2018)
    host-id=1
    score=0
    vm_conf_refresh_time=4721144 (Sun Feb 18 11:20:33 2018)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineUnexpectedlyDown
    stopped=False
    timeout=Tue Feb 24 15:29:44 1970

--== Host 2 status ==--
conf_on_shared_storage             : True
Status up-to-date                  : True
Hostname                           : v1
Host ID                            : 2
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 0
stopped                            : False
Local maintenance                  : False
crc32                              : 5cbcef4c
local_conf_timestamp               : 2499416
Host timestamp                     : 2499416
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=2499416 (Sun Feb 18 11:20:46 2018)
    host-id=2
    score=0
    vm_conf_refresh_time=2499416 (Sun Feb 18 11:20:46 2018)
    conf_on_shared_storage=True
    maintenance=False
    state=EngineUnexpectedlyDown
    stopped=False
    timeout=Thu Jan 29 22:18:42 1970

--== Host 3 status ==--

conf_on_shared_storage             : True
Status up-to-date                  : False
Hostname                           : v2
Host ID                            : 3
Engine status                      : unknown stale-data
Score                              : 3400
stopped                            : False
Local maintenance                  : False
crc32                              : f064d529
local_conf_timestamp               : 2920612
Host timestamp                     : 2920611
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=2920611 (Sun Feb 18 10:47:31 2018)
    host-id=3
    score=3400
    vm_conf_refresh_time=2920612 (Sun Feb 18 10:47:32 2018)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

Putting each host into maintenance and then activating it again does not resolve the issue. It seems I should have avoided defining an email address during deploy and set it only later from the GUI. How can one recover from this situation?

Thanx,
Alex
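P.S. For completeness, here is the change as I applied it, collected in one place. The --get-shared-config line is an extra sanity check I did not run originally; I believe that option is part of the same hosted-engine CLI, but please correct me if the name differs in your version.

```shell
# Change the alert destination in the broker's shared configuration
# (address elided, as in the text above)
hosted-engine --set-shared-config destination-emails ale...@domain.com --type=broker

# Sanity check (my addition, not run originally):
# read the value back to confirm it was stored
hosted-engine --get-shared-config destination-emails --type=broker

# Restart the HA services on each host so they pick up the new configuration
systemctl restart ovirt-ha-broker.service
systemctl restart ovirt-ha-agent.service
```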
_______________________________________________ Users mailing list Users@ovirt.org http://lists.ovirt.org/mailman/listinfo/users