----- Original Message -----
> From: "Cong Yue" <cong_...@alliedtelesis.com>
> To: "Simone Tiraboschi" <stira...@redhat.com>
> Cc: users@ovirt.org
> Sent: Friday, December 19, 2014 9:25:32 PM
> Subject: RE: [ovirt-users] VM failover with ovirt3.5
>
> In the documentation of
> http://www.ovirt.org/OVirt_Administration_Guide#.E2.81.A0Improving_Uptime_with_Virtual_Machine_High_Availability
> it says
> To enable the migration of highly available virtual machines:
> Power management must be configured for the hosts running the highly
> available virtual machines.

Hosted engine and VM HA are not really the same feature, because other VMs can be managed by the engine while the engine VM itself cannot (a chicken-and-egg problem).

> Does this mean I need to configure power management for all ovirt nodes?

No, it's not mandatory for hosted engine, but it would be better to do so just for power management itself.

> Thanks,
> Cong
>
> -----Original Message-----
> From: Yue, Cong
> Sent: Friday, December 19, 2014 10:22 AM
> To: 'Simone Tiraboschi'
> Cc: users@ovirt.org
> Subject: RE: [ovirt-users] VM failover with ovirt3.5
>
> Thanks for the information. This is the log for my three ovirt nodes.
> From the output of hosted-engine --vm-status, it shows the engine state for
> my 2nd and 3rd ovirt node is DOWN.
> Is this the reason why VM failover does not work in my environment? How can I
> make the engine also work on my 2nd and 3rd ovirt nodes?
> --
> --== Host 1 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.94
> Host ID                            : 1
> Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 150475
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=150475 (Fri Dec 19 13:12:18 2014)
>     host-id=1
>     score=2400
>     maintenance=False
>     state=EngineUp
>
>
> --== Host 2 status ==--
>
> Status up-to-date                  : True
> Hostname                           : 10.0.0.93
> Host ID                            : 2
> Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 1572
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=1572 (Fri Dec 19 10:12:18 2014)
>     host-id=2
>     score=2400
>     maintenance=False
>     state=EngineDown
>
>
> --== Host 3 status ==--
>
> Status up-to-date                  : False
> Hostname                           : 10.0.0.92
> Host ID                            : 3
> Engine status                      : unknown stale-data
> Score                              : 2400
> Local maintenance                  : False
> Host timestamp                     : 987
> Extra metadata (valid at timestamp):
>     metadata_parse_version=1
>     metadata_feature_version=1
>     timestamp=987 (Fri Dec 19 10:09:58 2014)
>     host-id=3
>     score=2400
>     maintenance=False
>     state=EngineDown
>
> --
> And the /var/log/ovirt-hosted-engine-ha/agent.log files for the three ovirt
> nodes are as follows:
> --
> 10.0.0.94(hosted-engine-1)
> ---
> MainThread::INFO::2014-12-19 13:09:33,716::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:09:33,716::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:09:44,017::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:09:44,017::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:09:54,303::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:09:54,303::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:10:04,342::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine vm running on localhost
> MainThread::INFO::2014-12-19 13:10:04,617::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:10:04,617::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:10:14,657::state_machine::160::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Global metadata: {'maintenance': False}
> MainThread::INFO::2014-12-19 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host 10.0.0.93 (id 2): {'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=1448 (Fri Dec 19 10:10:14 2014)\nhost-id=2\nscore=2400\nmaintenance=False\nstate=EngineDown\n', 'hostname': '10.0.0.93', 'alive': True, 'host-id': 2, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False, 'host-ts': 1448}
> MainThread::INFO::2014-12-19 13:10:14,657::state_machine::165::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Host 10.0.0.92 (id 3): {'extra': 'metadata_parse_version=1\nmetadata_feature_version=1\ntimestamp=987 (Fri Dec 19 10:09:58 2014)\nhost-id=3\nscore=2400\nmaintenance=False\nstate=EngineDown\n', 'hostname': '10.0.0.92', 'alive': True, 'host-id': 3, 'engine-status': {'reason': 'vm not running on this host', 'health': 'bad', 'vm': 'down', 'detail': 'unknown'}, 'score': 2400, 'maintenance': False, 'host-ts': 987}
> MainThread::INFO::2014-12-19 13:10:14,658::state_machine::168::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(refresh) Local (id 1): {'engine-health': {'health': 'good', 'vm': 'up', 'detail': 'up'}, 'bridge': True, 'mem-free': 1079.0, 'maintenance': False, 'cpu-load': 0.0269, 'gateway': True}
> MainThread::INFO::2014-12-19 13:10:14,904::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:10:14,904::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:10:25,210::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:10:25,210::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:10:35,499::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:10:35,499::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:10:45,784::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:10:45,785::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:10:56,070::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:10:56,070::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:11:06,109::states::394::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume) Engine vm running on localhost
> MainThread::INFO::2014-12-19 13:11:06,359::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:11:06,359::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:11:16,658::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:11:16,658::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:11:26,991::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:11:26,991::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> MainThread::INFO::2014-12-19 13:11:37,341::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineUp (score: 2400)
> MainThread::INFO::2014-12-19 13:11:37,341::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.93 (id: 2, score: 2400)
> ----
>
> 10.0.0.93 (hosted-engine-2)
> MainThread::INFO::2014-12-19 10:12:18,339::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19 10:12:18,339::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19 10:12:28,651::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19 10:12:28,652::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19 10:12:39,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19 10:12:39,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19 10:12:49,338::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19 10:12:49,338::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19 10:12:59,642::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19 10:12:59,642::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.94 (id: 1, score: 2400)
> MainThread::INFO::2014-12-19 10:13:10,010::hosted_engine::327::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Current state EngineDown (score: 2400)
> MainThread::INFO::2014-12-19 10:13:10,010::hosted_engine::332::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring) Best remote host 10.0.0.94 (id: 1, score: 2400)
>
>
> 10.0.0.92(hosted-engine-3)
> same as 10.0.0.93
> --
>
> -----Original Message-----
> From: Simone Tiraboschi [mailto:stira...@redhat.com]
> Sent: Friday, December 19, 2014 12:28 AM
> To: Yue, Cong
> Cc: users@ovirt.org
> Subject: Re: [ovirt-users] VM failover with ovirt3.5
>
> ----- Original Message -----
> > From: "Cong Yue" <cong_...@alliedtelesis.com>
> > To: users@ovirt.org
> > Sent: Friday, December 19, 2014 2:14:33 AM
> > Subject: [ovirt-users] VM failover with ovirt3.5
> >
> > Hi
> >
> > In my environment, I have 3 ovirt nodes as one cluster. And on top of
> > host-1, there is one VM to host the ovirt engine.
> >
> > Also I have one external storage for the cluster to use as data domain
> > of engine and data.
> >
> > I confirmed live migration works well in my environment.
> >
> > But VM failover seems very buggy if I force one ovirt node to shut
> > down. Sometimes the VM on the node that was shut down can migrate to
> > another host, but it takes more than several minutes.
> >
> > Sometimes, it cannot migrate at all. Sometimes, the VM begins to move
> > only when the host is back.
>
> Can you please check or share the logs under /var/log/ovirt-hosted-engine-ha/
> ?
>
> > Is there some documentation to explain how VM failover works? And
> > are there any bugs reported related to this?
>
> http://www.ovirt.org/Features/Self_Hosted_Engine#Agent_State_Diagram
>
> > Thanks in advance,
> >
> > Cong
> >
> > This e-mail message is for the sole use of the intended recipient(s)
> > and may contain confidential and privileged information. Any
> > unauthorized review, use, disclosure or distribution is prohibited. If
> > you are not the intended recipient, please contact the sender by reply
> > e-mail and destroy all copies of the original message. If you are the
> > intended recipient, please be advised that the content of this message
> > is subject to access, review and disclosure by the sender's e-mail System
> > Administrator.
> >
> > _______________________________________________
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
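
For anyone reading the `hosted-engine --vm-status` paste quoted in this thread, here is a minimal Python sketch of how that output could be summarized per host. It assumes the text layout shown in the paste (a `--== Host N status ==--` header followed by `key : value` lines); `parse_vm_status` and `summarize` are hypothetical helper names, not part of oVirt, and the sample text is shortened from the posted output. The engine VM runs on only one host at a time, so a host reporting "vm not running on this host" with an up-to-date status is treated as a standby rather than a fault, while `Status up-to-date : False` is flagged as stale metadata.

```python
import json
import re

# Shortened copy of the status output posted in this thread.
SAMPLE = """\
--== Host 1 status ==--

Status up-to-date                  : True
Hostname                           : 10.0.0.94
Host ID                            : 1
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Score                              : 2400

--== Host 2 status ==--

Status up-to-date                  : True
Hostname                           : 10.0.0.93
Host ID                            : 2
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Score                              : 2400

--== Host 3 status ==--

Status up-to-date                  : False
Hostname                           : 10.0.0.92
Host ID                            : 3
Engine status                      : unknown stale-data
Score                              : 2400
"""

HEADER = re.compile(r"--== Host (\d+) status ==--")


def parse_vm_status(text):
    """Split the status text into one dict of key/value strings per host."""
    hosts = []
    current = None
    for line in text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            current = {"id": int(header.group(1))}
            hosts.append(current)
        elif current is not None and " : " in line:
            # Split on the first " : " only, so JSON values stay intact.
            key, value = line.split(" : ", 1)
            current[key.strip()] = value.strip()
    return hosts


def summarize(hosts):
    """Return one human-readable line per host."""
    report = []
    for host in hosts:
        try:
            engine = json.loads(host.get("Engine status", ""))
        except ValueError:
            engine = {}  # e.g. "unknown stale-data" is not JSON
        if host.get("Status up-to-date") != "True":
            note = "stale metadata, the agent on this host may not be reporting"
        elif engine.get("vm") == "up":
            note = "engine VM is running here"
        else:
            note = "standby, engine VM not running on this host"
        report.append(
            "host %d (%s): score %s, %s"
            % (host["id"], host.get("Hostname"), host.get("Score"), note)
        )
    return report


if __name__ == "__main__":
    for line in summarize(parse_vm_status(SAMPLE)):
        print(line)
```

Read this way, the posted output shows host 2 as a healthy standby and host 3 as the one worth a closer look, since its metadata is reported as stale.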