Hi! I'm using NFS for both primary and secondary storage. I did a graceful shutdown of KVM host A.
Thanks

Sent via iPhone
Luciano

> On 17/07/2015, at 19:00, Milamber <milam...@apache.org> wrote:
>
>> On 17/07/2015 21:23, Somesh Naidu wrote:
>> Ok, so here are my findings.
>>
>> 1. Host ID 3 was shut down around 2015-07-16 12:19:09, at which point the
>> management server called a disconnect.
>> 2. Based on the logs, it seems VM IDs 32, 18, 39 and 46 were running on the
>> host.
>> 3. No HA tasks for any of these VMs at this time.
>> 4. Management server restarted at around 2015-07-16 12:30:20.
>> 5. Host ID 3 connected back at around 2015-07-16 12:44:08.
>> 6. Management server identified the missing VMs and triggered HA on those.
>> 7. The VMs were eventually started, all 4 of them.
>>
>> I am not 100% sure why HA wasn't triggered until 2015-07-16 12:30 (#3), but
>> I know that the management server restart caused it not to happen until the
>> host was reconnected.
>
> Perhaps the management server didn't recognize host 3 as totally down (ping
> still alive? or some quorum condition not met)?
> Was the only way for the mgmt server to fully accept that host 3 had a real
> problem that host 3 had been rebooted (around 12:44)?
>
> What is the storage subsystem? CLVMd?
>
>
>>
>> Regards,
>> Somesh
>>
>>
>> -----Original Message-----
>> From: Luciano Castro [mailto:luciano.cas...@gmail.com]
>> Sent: Friday, July 17, 2015 12:13 PM
>> To: users@cloudstack.apache.org
>> Subject: Re: HA feature - KVM - CloudStack 4.5.1
>>
>> No problem, Somesh, thanks for your help.
>>
>> Link to the log:
>>
>> https://dl.dropboxusercontent.com/u/6774061/management-server.log.2015-07-16.gz
>>
>> Luciano
>>
>> On Fri, Jul 17, 2015 at 12:00 PM, Somesh Naidu <somesh.na...@citrix.com>
>> wrote:
>>
>>> How large are the management server logs dated 2015-07-16? I would like to
>>> review them. All the information I need from that incident should be in
>>> there, so I don't need any more testing.
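[Editor's note: Somesh's timeline above can be cross-checked mechanically from the same log. A minimal sketch, assuming 4.5.x-style log wording based on the class names quoted in this thread (`HighAvailabilityManagerImpl`, the agent disconnect handling); the grep patterns are assumptions and should be adapted to the exact phrasing in your log.]

```shell
# Sketch: pull host-disconnect and HA-manager lines out of the management
# server log and print them in time order, to see when HA actually fired
# relative to the 12:19 disconnect and the 12:30 restart.
# The patterns below are assumptions; adjust them to your log's wording.
ha_timeline() {
    # $1: path to an uncompressed management-server.log
    grep -E 'HighAvailabilityManagerImpl|AgentManagerImpl.*[Dd]isconnect' "$1" | sort
}
```

Running this over management-server.log.2015-07-16 (after `gunzip`) should show whether any HA work was scheduled between the disconnect and the management server restart.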
>>>
>>> Regards,
>>> Somesh
>>>
>>> -----Original Message-----
>>> From: Luciano Castro [mailto:luciano.cas...@gmail.com]
>>> Sent: Friday, July 17, 2015 7:58 AM
>>> To: users@cloudstack.apache.org
>>> Subject: Re: HA feature - KVM - CloudStack 4.5.1
>>>
>>> Hi Somesh!
>>>
>>> [root@1q2 ~]# zgrep -i -E 'SimpleInvestigator|KVMInvestigator|PingInvestigator|ManagementIPSysVMInvestigator' /var/log/cloudstack/management/management-server.log.2015-07-16.gz | tail -5000 > /tmp/management.txt
>>> [root@1q2 ~]# cat /tmp/management.txt
>>> 2015-07-16 12:30:45,452 DEBUG [o.a.c.s.l.r.ExtensionRegistry] (main:null) Registering extension [KVMInvestigator] in [Ha Investigators Registry]
>>> 2015-07-16 12:30:45,452 DEBUG [o.a.c.s.l.r.RegistryLifecycle] (main:null) Registered com.cloud.ha.KVMInvestigator@57ceec9a
>>> 2015-07-16 12:30:45,927 DEBUG [o.a.c.s.l.r.ExtensionRegistry] (main:null) Registering extension [PingInvestigator] in [Ha Investigators Registry]
>>> 2015-07-16 12:30:45,928 DEBUG [o.a.c.s.l.r.ExtensionRegistry] (main:null) Registering extension [ManagementIPSysVMInvestigator] in [Ha Investigators Registry]
>>> 2015-07-16 12:30:53,796 INFO [o.a.c.s.l.r.DumpRegistry] (main:null) Registry [Ha Investigators Registry] contains [SimpleInvestigator, XenServerInvestigator, KVMInv
>>>
>>> I had searched this log before, but I thought there was nothing special in it.
>>>
>>> If you want to propose another test scenario, I can do it.
>>>
>>> Thanks
>>>
>>>
>>> On Thu, Jul 16, 2015 at 7:27 PM, Somesh Naidu <somesh.na...@citrix.com>
>>> wrote:
>>>
>>>> What about the other investigators, specifically "KVMInvestigator,
>>>> PingInvestigator"? Do they report the VMs as alive=false too?
>>>>
>>>> Also, it is recommended that you look at management-server.log instead
>>>> of catalina.out (for one, the latter doesn't have timestamps).
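[Editor's note: the zgrep above mostly returned extension-registration lines from the restart rather than investigator verdicts. A sketch that isolates the actual alive checks and tallies them per investigator; the "found VM[...]to be alive?" wording is taken verbatim from the log excerpts later in this thread, so this is an assumption only about the verdict lines of other investigators.]

```shell
# Sketch: keep only investigator verdict lines and summarize how many
# times each investigator answered alive? true/false.
verdict_summary() {
    # $1: path to an uncompressed management-server.log
    grep -oE '[A-Za-z]+Investigator found VM\[[^]]+\]to be alive\? (true|false)' "$1" \
        | awk '{print $1, $NF}' | sort | uniq -c
}
```

A row like `9 SimpleInvestigator false` with no `KVMInvestigator`/`PingInvestigator` rows would directly answer Somesh's question about which investigators actually ran.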
>>>>
>>>> Regards,
>>>> Somesh
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Luciano Castro [mailto:luciano.cas...@gmail.com]
>>>> Sent: Thursday, July 16, 2015 1:14 PM
>>>> To: users@cloudstack.apache.org
>>>> Subject: Re: HA feature - KVM - CloudStack 4.5.1
>>>>
>>>> Hi Somesh!
>>>>
>>>> Thanks for the help. I did it again and collected new logs:
>>>>
>>>> My vm_instance name is i-2-39-VM. There were some routers on KVM host 'A'
>>>> (the one that I powered off this time):
>>>>
>>>> [root@1q2 ~]# grep -i -E 'SimpleInvestigator.*false' /var/log/cloudstack/management/catalina.out
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-2:ctx-e2f91c9c work-3) SimpleInvestigator found VM[DomainRouter|r-4-VM]to be alive? false
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-1:ctx-729acf4f work-7) SimpleInvestigator found VM[User|i-23-33-VM]to be alive? false
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-4:ctx-a66a4941 work-8) SimpleInvestigator found VM[DomainRouter|r-36-VM]to be alive? false
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-1:ctx-5977245e work-10) SimpleInvestigator found VM[User|i-17-26-VM]to be alive? false
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-1:ctx-c7f39be0 work-9) SimpleInvestigator found VM[DomainRouter|r-32-VM]to be alive? false
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-3:ctx-ad4f5fda work-10) SimpleInvestigator found VM[DomainRouter|r-46-VM]to be alive? false
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-0:ctx-0257f5af work-11) SimpleInvestigator found VM[User|i-4-52-VM]to be alive? false
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-4:ctx-7ddff382 work-12) SimpleInvestigator found VM[DomainRouter|r-32-VM]to be alive? false
>>>> INFO [c.c.h.HighAvailabilityManagerImpl] (HA-Worker-1:ctx-9f79917e work-13) SimpleInvestigator found VM[User|i-2-39-VM]to be alive? false
>>>>
>>>>
>>>> KVM host 'B' agent log (where the machine would migrate to):
>>>>
>>>> 2015-07-16 16:58:56,537 INFO [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) Live migration of instance i-2-39-VM initiated
>>>> 2015-07-16 16:58:57,540 INFO [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) Waiting for migration of i-2-39-VM to complete, waited 1000ms
>>>> 2015-07-16 16:58:58,541 INFO [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) Waiting for migration of i-2-39-VM to complete, waited 2000ms
>>>> 2015-07-16 16:58:59,542 INFO [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) Waiting for migration of i-2-39-VM to complete, waited 3000ms
>>>> 2015-07-16 16:59:00,543 INFO [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) Waiting for migration of i-2-39-VM to complete, waited 4000ms
>>>> 2015-07-16 16:59:01,245 INFO [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-4:null) Migration thread for i-2-39-VM is done
>>>>
>>>> It said done for my i-2-39-VM instance, but I can't ping it.
>>>>
>>>> Luciano
>>>
>>>
>>> --
>>> Luciano Castro
>
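[Editor's note: "Migration thread ... is done" only means the agent's migration loop exited; it does not by itself prove the guest came up on host B. A small sketch to summarize how long the agent waited, using the exact "Waiting for migration of ... waited Nms" wording from the agent log above:]

```shell
# Sketch: report the last "waited Nms" entry for one instance in the KVM
# agent log, i.e. roughly how long the migration loop ran before "done".
migration_wait() {
    # $1: instance name, e.g. i-2-39-VM; $2: path to agent.log
    grep "Waiting for migration of $1 to complete" "$2" \
        | grep -oE 'waited [0-9]+ms' | tail -1
}
```

Independently of the log, `virsh domstate i-2-39-VM` on host B (expecting `running`) and a look at the guest via `virsh console` would distinguish a failed migration from a VM that migrated fine but is unreachable by ping for network reasons (for example, its virtual router not yet restarted).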