[ 
https://issues.apache.org/jira/browse/CLOUDSTACK-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13692981#comment-13692981
 ] 

Prasanna Santhanam commented on CLOUDSTACK-3137:
------------------------------------------------

The deployment uses XenServer, so there are no agent-side logs like those a KVM deployment would have.
                
> systemvms HA fails on latest master (4591f94)
> ---------------------------------------------
>
>                 Key: CLOUDSTACK-3137
>                 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-3137
>             Project: CloudStack
>          Issue Type: Bug
>      Security Level: Public (Anyone can view this level - this is the default.)
>            Reporter: Prasanna Santhanam
>            Priority: Blocker
>         Attachments: logs.tar.bz2
>
>
> SystemVMs fail to start during HA on latest master 
> (4591f94a0bb69875dca769c9738998d3b5b96106).
> It looks like the HA of system VMs is failing. In our automated test 
> environment we reboot the management server before the systemVMs spin up to 
> ensure that the global settings take effect and that 
> secstorage.internal.sites is correctly set in the secondary storage VM. Upon 
> reboot, the management server discovers that the SSVM was queued for HA in 
> the op_ha_work table and starts to re-spin the SSVM. At this time the CPVM is 
> also booting up. No further activity is observed after two HA attempts.
> SystemVM start before the management server restart: 
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,738 DEBUG 
> [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Allocating entries 
> for VM: VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,771 DEBUG 
> [cloud.vm.VirtualMachineManagerImpl] (secstorage-1:null) Allocating nics for 
> VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,774 DEBUG 
> [cloud.network.NetworkManagerImpl] (secstorage-1:null) Allocating nic for vm 
> VM[SecondaryStorageVm|s-1-VM] in network Ntwk[200|Public|1] with requested 
> profile NicProfile[0-0-null-null-null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,843 DEBUG 
> [cloud.network.NetworkManagerImpl] (secstorage-1:null) Allocating nic for vm 
> VM[SecondaryStorageVm|s-1-VM] in network Ntwk[202|Control|3] with requested 
> profile null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,870 DEBUG 
> [cloud.network.NetworkManagerImpl] (secstorage-1:null) Allocating nic for vm 
> VM[SecondaryStorageVm|s-1-VM] in network Ntwk[201|Management|2] with 
> requested profile null
> Jun 24 22:30:49 cloudstack-centos63 local0: 2013-06-25 05:30:49,916 DEBUG 
> [cloud.network.NetworkManagerImpl] (secstorage-1:null) Allocating nic for vm 
> VM[SecondaryStorageVm|s-1-VM] in network Ntwk[203|Storage|4] with requested 
> profile null
> ...
> ...
> Jun 24 22:30:52 cloudstack-centos63 local0: 2013-06-25 05:30:52,552 DEBUG 
> [agent.transport.Request] (consoleproxy-1:null) Seq 3-2067660811: Sending  { 
> Cmd , MgmtId: 200973787296321, via: 3, Ver: v1, Flags: 100111, 
> [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"path":"template/tmpl/1/1/","origUrl":"http://nfs/templates/acton/acton-systemvm-02062012.vhd.bz2","uuid":"1","id":1,"format":"VHD","accountId":1,"checksum":"f613f38c96bf039f2e5cbf92fa8ad4f8","hvm":false,"displayText":"SystemVM
>  Template 
> (XenServer)","imageDataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"nfs://nfs.fmt.vmops.com:/expo...
> Jun 24 22:31:01 cloudstack-centos63 local0: 2013-06-25 05:31:01,824 INFO  
> [context.support.XmlWebApplicationContext] (Thread-26:null) Closing Root 
> WebApplicationContext: startup date [Mon Jun 24 23:25:50 UTC 2013]; root of 
> context hierarchy
> ^^^^^^^^^^ Management Server Restarts ^^^^^^^^^^^^^^
> After restart SSVM scheduled for HA
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,394 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (Timer-1:null) Schedule vm for HA:  
> VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,455 INFO  
> [cloud.vm.VirtualMachineManagerImpl] (Timer-1:null) Handling unfinished work 
> item: ItWork[1805a89b-d03a-4f1f-b717-ac9f5da6edea-Starting-2-Prepare]
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,504 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (Timer-1:null) Schedule vm for HA:  
> VM[ConsoleProxy|v-2-VM]
> ...
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,540 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:null) Starting work
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,547 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-1:null) Starting work
> ...
> Jun 24 22:32:52 cloudstack-centos63 local0: 2013-06-25 05:32:52,815 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-1) Processing 
> HAWork[1-HA-1-Starting-Investigating]
> ...
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,289 DEBUG 
> [cloud.ha.CheckOnAgentInvestigator] (HA-Worker-0:work-2) Unable to reach the 
> agent for VM[ConsoleProxy|v-2-VM]: Resource [Host:3] is unreachable: Host 3: 
> Host with specified id is not in the right state: Disconnected
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,290 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:work-2) 
> SimpleInvestigator found VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,293 DEBUG 
> [cloud.ha.CheckOnAgentInvestigator] (HA-Worker-2:work-1) Unable to reach the 
> agent for VM[SecondaryStorageVm|s-1-VM]: Resource [Host:2] is unreachable: 
> Host 2: Host with specified id is not in the right state: Disconnected
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,298 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-1) 
> SimpleInvestigator found VM[SecondaryStorageVm|s-1-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,336 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:work-2) 
> XenServerInvestigator found VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 DEBUG 
> [cloud.ha.UserVmDomRInvestigator] (HA-Worker-0:work-2) Not a User Vm, unable 
> to determine state of VM[ConsoleProxy|v-2-VM] returning null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:work-2) null found 
> VM[ConsoleProxy|v-2-VM]to be alive? null
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,338 DEBUG 
> [cloud.ha.ManagementIPSystemVMInvestigator] (HA-Worker-0:work-2) Testing if 
> VM[ConsoleProxy|v-2-VM] is alive
> ....
> Rescheduled HA
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,623 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-0:work-2) Rescheduling 
> HAWork[2-HA-2-Starting-Investigating] to try again at Tue Jun 25 05:43:07 UTC 
> 2013
> Jun 24 22:32:53 cloudstack-centos63 local0: 2013-06-25 05:32:53,626 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-1) Rescheduling 
> HAWork[1-HA-1-Starting-Investigating] to try again at Tue Jun 25 05:43:07 UTC 
> 2013
> ...
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,586 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-1) Processing 
> HAWork[1-HA-1-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,592 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-1) HA on 
> VM[SecondaryStorageVm|s-1-VM]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,593 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-1) VM 
> VM[SecondaryStorageVm|s-1-VM] has been changed.  Current State = Starting 
> Previous State = Starting last updated = 6 previous updated = 2
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,593 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-3:work-1) Completed 
> HAWork[1-HA-1-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,611 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-2) Processing 
> HAWork[2-HA-2-Starting-Investigating]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-2) HA on 
> VM[ConsoleProxy|v-2-VM]
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-2) VM 
> VM[ConsoleProxy|v-2-VM] has been changed.  Current State = Starting Previous 
> State = Starting last updated = 6 previous updated = 2
> Jun 24 22:43:11 cloudstack-centos63 local0: 2013-06-25 05:43:11,618 INFO  
> [cloud.ha.HighAvailabilityManagerImpl] (HA-Worker-2:work-2) Completed 
> HAWork[2-HA-2-Starting-Investigating]
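
For anyone triaging the attached logs, here is a minimal sketch of how the HAWork transitions (Processing, Rescheduling, Completed) could be pulled out of the management server log to follow each work item. It assumes every log entry sits on a single line, unlike the mail-wrapped excerpt above, and that the message format matches what is quoted here. The name ha_events and the regular expression are illustrative only and are not part of CloudStack.

```python
import re

# Pattern for HighAvailabilityManagerImpl messages such as
#   (HA-Worker-0:work-2) Rescheduling HAWork[2-HA-2-Starting-Investigating] ...
# Assumption: the format is the one seen in the quoted excerpt.
HA_EVENT = re.compile(
    r"\((?P<thread>HA-Worker-\d+):[^)]*\)\s+"
    r"(?P<action>Processing|Rescheduling|Completed)\s+"
    r"HAWork\[(?P<work>[^\]]+)\]"
)


def ha_events(lines):
    """Yield (thread, action, work) for each HAWork transition in lines."""
    for line in lines:
        match = HA_EVENT.search(line)
        if match:
            yield match.group("thread"), match.group("action"), match.group("work")


if __name__ == "__main__":
    import sys

    # Usage: python ha_events.py < management-server.log
    for thread, action, work in ha_events(sys.stdin):
        print(f"{work}\t{action}\t{thread}")
```

Grouping the output by work item should make it easier to see that both items are rescheduled once and then marked Completed while the VMs remain in the Starting state.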

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
