On 04/09/2014 02:32 PM, Daniel Helgenberger wrote:
On Mi, 2014-04-09 at 09:18 +0200, Jiri Moskovcak wrote:
On 04/08/2014 06:09 PM, Daniel Helgenberger wrote:
Hello,
I have an oVirt 3.4 hosted engine lab setup which I am evaluating for
production use.
I "simulated" an ungraceful shutdown of all HA nodes (powercut) while
the engine was running. After powering up, the system did not seem to
recover by itself.
I had to restart the ovirt-hosted-ha service (which was in a locked
state) and then manually run 'hosted-engine --vm-start'.
What is the expected procedure after a shutdown (graceful / ungraceful)
of Hosted-Engine HA nodes? Should the engine recover by itself? Should
the running VMs be restarted automatically?
When this happens the agent should start the engine VM and the engine
should take care of restarting the VMs which were running on that
restarted host and are marked as HA. Can you please provide contents of
/var/log/ovirt* from the host after the powercut when the engine VM
doesn't come up?
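
One possible way to gather those logs into a single attachment is sketched below. This is only an illustration: the function name collect_logs and the archive location are made up for this example, and the script needs enough privileges to read the files under /var/log.

```python
import glob
import os
import tarfile
from typing import List


def collect_logs(pattern: str, archive_path: str) -> List[str]:
    """Pack every path matching `pattern` into a gzip tarball.

    Returns the list of matched top-level paths (hypothetical helper,
    not part of any oVirt tooling).
    """
    matches = sorted(glob.glob(pattern))
    with tarfile.open(archive_path, "w:gz") as archive:
        for path in matches:
            # tarfile.add() descends into directories recursively by default.
            # Store names relative to "/" so the archive has no absolute paths.
            archive.add(path, arcname=os.path.relpath(path, "/"))
    return matches
```

Calling collect_logs("/var/log/ovirt*", "/tmp/ovirt-logs.tar.gz") on the affected host would produce one file to attach; the output path here is just an example.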
Hello Jirka,
I accidentally sent the message before pointing out the
interesting part, which is:
<<< start logging ha-agent after reboot:
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,862::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
ovirt-hosted-engine-ha agent 1.1.2-1 started
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,936::hosted_engine::223::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_get_hostname)
Found certificate common name: 192.168.50.201
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,937::hosted_engine::363::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
Initializing ha-broker connection
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,937::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor ping, options {'addr': '192.168.50.1'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,939::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700911299600
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:33,939::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor mgmt-bridge, options {'use_ssl': 'true', 'bridge_name':
'ovirtmgmt', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,013::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700911300304
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,013::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor mem-free, options {'use_ssl': 'true', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,015::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700911300112
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,015::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor cpu-load-no-engine, options {'use_ssl': 'true', 'vm_uuid':
'e68a11c8-1251-4c13-9e3b-3847bbb4fa3d', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,018::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700911300240
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,018::brokerlink::126::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Starting monitor engine-health, options {'use_ssl': 'true', 'vm_uuid':
'e68a11c8-1251-4c13-9e3b-3847bbb4fa3d', 'address': '0'}
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,024::brokerlink::137::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(start_monitor)
Success, id 139700723857104
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,024::hosted_engine::386::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker)
Broker initialized, all submonitors started
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:53:34,312::hosted_engine::430::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_cond_start_service)
Starting vdsmd
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::CRITICAL::2014-04-08
15:53:34,442::agent::103::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Could
not start ha-agent
(10 min nothing)
<<< here I did a 'service ovirt-hosted-ha start'
/var/log/ovirt-hosted-engine-ha/agent.log:MainThread::INFO::2014-04-08
15:59:16,698::agent::52::ovirt_hosted_engine_ha.agent.agent.Agent::(run)
ovirt-hosted-engine-ha agent 1.1.2-1 started
....
After this, things went quite smoothly.
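
For anyone who wants to pull the same information out of agent.log without reading it by eye, here is a rough sketch. It assumes the "::"-separated layout visible in the excerpt above (thread, level, timestamp, module, line number, logger, message) and that each entry sits on one unwrapped line, as in the file itself rather than in this mail. The names AgentLogEntry, parse_agent_line and critical_with_gap are invented for this example.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, NamedTuple, Optional, Tuple


class AgentLogEntry(NamedTuple):
    thread: str
    level: str
    timestamp: datetime
    logger: str
    message: str


def parse_agent_line(line: str) -> Optional[AgentLogEntry]:
    """Parse one unwrapped agent.log line; return None if it does not fit."""
    parts = line.rstrip("\n").split("::", 6)
    if len(parts) != 7:
        return None
    thread, level, stamp, _module, _lineno, logger, message = parts
    try:
        # Timestamps look like "2014-04-08 15:53:34,442".
        timestamp = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
    except ValueError:
        return None
    return AgentLogEntry(thread, level, timestamp, logger, message)


def critical_with_gap(
    lines: Iterable[str],
) -> List[Tuple[AgentLogEntry, Optional[timedelta]]]:
    """Return each CRITICAL entry with the time until the next parsed entry."""
    entries = [e for e in map(parse_agent_line, lines) if e is not None]
    result = []
    for index, entry in enumerate(entries):
        if entry.level != "CRITICAL":
            continue
        following = entries[index + 1] if index + 1 < len(entries) else None
        gap = following.timestamp - entry.timestamp if following else None
        result.append((entry, gap))
    return result
```

Feeding it the lines of /var/log/ovirt-hosted-engine-ha/agent.log should list the "Could not start ha-agent" entry together with how long the agent stayed silent afterwards.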
Hi Daniel,
I noticed that in the log and I was just about to ask if that's when you
manually fixed it. Is there something else around that time in
/var/log/messages which might be related to it?
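
If it helps to narrow the search, something along these lines could filter the syslog file to a window around the failure. It is only a sketch and rests on assumptions: that the file uses the traditional "Mon DD HH:MM:SS" prefix (which carries no year, so the year is passed in), and that month names are in English. The name syslog_window is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, Iterator


def syslog_window(
    lines: Iterable[str], year: int, center: datetime, margin: timedelta
) -> Iterator[str]:
    """Yield lines whose timestamp lies within `margin` of `center`."""
    for line in lines:
        try:
            # The first 15 characters hold e.g. "Apr  8 15:53:34".
            stamp = datetime.strptime(
                f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"
            )
        except ValueError:
            # Skip continuation lines or entries in another format.
            continue
        if abs(stamp - center) <= margin:
            yield line
```

For the incident above one might use center=datetime(2014, 4, 8, 15, 53, 34) with a margin of a few minutes.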
Thanks,
Jirka
Thanks,
Daniel
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users