I just tried to deploy a fresh 4.1 zone from current 4.1.2, I can't deploy initial system vms. It immediately takes out a file lock on the system vm template, and then gets stuck. if I remove the lock, restart the agent, it immediately goes back into this state:
AGENT (after deleting template/tmpl/1/3/KVMFILELOCK dir): 2013-08-07 16:53:41,759 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) Didn't find an existing storage pool 2a9c418e-dd56-3f2c-9081-032b877d5644 by UUID, checking for pools with duplicate paths 2013-08-07 16:53:41,766 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) Checking path of existing pool 2fa10069-1fb3-4eea-8dc4-f773e7be0d19 against pool we want to create 2013-08-07 16:53:41,774 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) Attempting to create storage pool 2a9c418e-dd56-3f2c-9081-032b877d5644 2013-08-07 16:53:41,774 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) <pool type='netfs'> <name>2a9c418e-dd56-3f2c-9081-032b877d5644</name> <uuid>2a9c418e-dd56-3f2c-9081-032b877d5644</uuid> <source> <host name='172.17.10.10'/> <dir path='/nfs/secondary/template/tmpl/1/3'/> </source> <target> <path>/mnt/2a9c418e-dd56-3f2c-9081-032b877d5644</path> </target> </pool> 2013-08-07 16:53:42,024 DEBUG [cloud.agent.Agent] (agentRequest-Handler-1:null) Seq 1-971702293: { Ans: , MgmtId: 90520735907663, via: 1, Ver: v1, Flags: 110, [{"storage.PrimaryStorageDownloadAnswer":{"templateSize":0,"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: Can not get file lock to refresh the pool 2fa10069-1fb3-4eea-8dc4-f773e7be0d19","wait":0}}] } Now KVMFILELOCK dir exists. On Wed, Aug 7, 2013 at 2:13 PM, Rayees Namathponnan (JIRA) <j...@apache.org> wrote: > > [ > https://issues.apache.org/jira/browse/CLOUDSTACK-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732678#comment-13732678 > ] > > Rayees Namathponnan commented on CLOUDSTACK-2729: > ------------------------------------------------- > > Need few more automation runs result to verify this\, i will close this after > verifying next two runs > >> [Automation] Libvirt failed find primary storage and VM deployment failed >> ---------------------------------------------------------------------------- >> >> Key: CLOUDSTACK-2729 >> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-2729 >> Project: CloudStack >> Issue Type: Bug >> Security Level: Public(Anyone can view this level - this is the >> default.) >> Components: Storage Controller >> Affects Versions: 4.2.0 >> Environment: Master branch build >> Automation environment - KVM >> Reporter: Rayees Namathponnan >> Assignee: edison su >> Priority: Blocker >> Fix For: 4.2.0 >> >> Attachments: CLOUDSTACK-2729.rar >> >> >> Failed to deploy VM in automation, environment created with , 2 hosts and 2 >> primary storages in a cluster >> 1) /export/home/rayees/SC_QA_AUTO4/primary2 >> 2) /export/home/rayees/SC_QA_AUTO4/primary >> Libvirt failed to find the primary storage >> /export/home/rayees/SC_QA_AUTO4/primary and VM deployment failed >> MS log >> 2013-05-28 21:11:44,540 DEBUG [agent.transport.Request] >> (consoleproxy-1:null) Seq 4-936706756: Sending { Cmd , MgmtId: >> 29066118877352, via: 4, Ver: v1, Flags: 100111, >> [{"StartCommand":{"vm":{"id":29,"name":"v-29-QA","type":"ConsoleProxy","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":1073741824,"maxRam":1073741824,"arch":"x86_64","os":"Debian >> GNU/Linux 5.0 (32-bit)","bootArgs":" template=domP type=consoleproxy >> host=10.223.49.195 port=8250 name=v-29-QA premium=true zone=1 pod=1 >> guid=Proxy.29 proxy_vm=29 disable_rp_filter=true eth2ip=10.223.122.73 >> eth2mask=255.255.255.192 gateway=10.223.122.65 eth0ip=169.254.0.154 >> eth0mask=255.255.0.0 eth1ip=10.223.50.96 eth1mask=255.255.255.192 >> mgmtcidr=10.223.49.192/26 localgw=10.223.50.65 internaldns1=10.223.110.254 >> dns1=72.52.126.11","rebootOnCrash":false,"enableHA":false,"limitCpuUse":false,"vncPassword":"d099568827911cef","params":{},"uuid":"5a146833-6a8c-44e5-83c0-50f34accf513","disks":[{"id":32,"name":"ROOT-29","mountPoint":"/export/home/rayees/SC_QA_AUTO4/primary","path":"f6f8d865-e9c0-4188-8a33-6c6383ca5075","size":276406784,"type":"ROOT","storagePoolType":"NetworkFilesystem","storagePoolUuid":"fff90cb5-06dd-33b3-8815-d78c08ca01d9","deviceId":0}],"nics":[{"deviceId":2,"networkRateMbps":-1,"defaultNic":true,"uuid":"21175978-96bd-4160-8228-8ad15aa40c66","ip":"10.223.122.73","netmask":"255.255.255.192","gateway":"10.223.122.65","mac":"06:2e:5a:00:00:42","dns1":"72.52.126.11","broadcastType":"Vlan","type":"Public","broadcastUri":"vlan://1221","isolationUri":"vlan://1221","isSecurityGroupEnabled":false},{"deviceId":0,"networkRateMbps":-1,"defaultNic":false,"uuid":"e0651452-a76e-4564-96b2-3d5d51e9bcd6","ip":"169.254.0.154","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:00:9a","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false},{"deviceId":1,"networkRateMbps":-1,"defaultNic":false,"uuid":"0765c229-468e-4dfd-8382-24ac49791a8d","ip":"10.223.50.96","netmask":"255.255.255.192","gateway":"10.223.50.65","mac":"06:a8:e8:00:00:1d","broadcastType":"Native","type":"Management","isSecurityGroupEnabled":false}]},"hostIp":"10.223.50.66","wait":0}},{"check.CheckSshCommand":{"ip":"169.254.0.154","port":3922,"interval":6,"retries":100,"name":"v-29-QA","wait":0}}] >> } >> 2013-05-28 21:11:44,552 DEBUG [agent.transport.Request] >> (AgentManager-Handler-1:null) Seq 4-936706756: Processing: { Ans: , MgmtId: >> 29066118877352, via: 4, Ver: v1, Flags: 110, >> [{"Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: >> org.libvirt.LibvirtException: Storage pool not found: no pool with matching >> uuid\n\tat >> com.cloud.hypervisor.kvm.storage.LibvirtStorageAdaptor.getStoragePool(LibvirtStorageAdaptor.java:380)\n\tat >> >> com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.getStoragePool(KVMStoragePoolManager.java:72)\n\tat >> >> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.createVbd(LibvirtComputingResource.java:3399)\n\tat >> >> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3293)\n\tat >> >> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1187)\n\tat >> com.cloud.agent.Agent.processRequest(Agent.java:525)\n\tat >> com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:852)\n\tat >> com.cloud.utils.nio.Task.run(Task.java:83)\n\tat >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)\n\tat >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)\n\tat >> >> java.lang.Thread.run(Thread.java:679)\n","wait":0}},{"Answer":{"result":false,"details":"Stopped >> by previous failure","wait":0}}] } >> 2013-05-28 21:11:44,552 DEBUG [agent.manager.AgentAttache] >> (AgentManager-Handler-1:null) Seq 4-936706756: No more commands found >> 2013-05-28 21:11:44,552 DEBUG [agent.transport.Request] >> (consoleproxy-1:null) Seq 4-936706756: Received: { Ans: , MgmtId: >> 29066118877352, via: 4, Ver: v1, Flags: 110, { Answer, Answer } } >> 2013-05-28 21:11:44,559 ERROR [cloud.vm.VirtualMachineManagerImpl] >> (consoleproxy-1:null) Failed to start instance VM[ConsoleProxy|v-29-QA] >> com.cloud.utils.exception.CloudRuntimeException: Unable to get answer that >> is of class com.cloud.agent.api.StartAnswer >> at com.cloud.agent.manager.Commands.getAnswer(Commands.java:80) >> at >> com.cloud.vm.VirtualMachineManagerImpl.advanceStart(VirtualMachineManagerImpl.java:873) >> at >> com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:550) >> at >> com.cloud.vm.VirtualMachineManagerImpl.start(VirtualMachineManagerImpl.java:543) >> at >> com.cloud.consoleproxy.ConsoleProxyManagerImpl.startProxy(ConsoleProxyManagerImpl.java:561) >> at >> com.cloud.consoleproxy.ConsoleProxyManagerImpl.allocCapacity(ConsoleProxyManagerImpl.java:923) >> at >> com.cloud.consoleproxy.ConsoleProxyManagerImpl.expandPool(ConsoleProxyManagerImpl.java:1667) >> at >> com.cloud.consoleproxy.ConsoleProxyManagerImpl.expandPool(ConsoleProxyManagerImpl.java:157) >> at >> com.cloud.vm.SystemVmLoadScanner.loadScan(SystemVmLoadScanner.java:111) >> Agent log >> 2013-05-29 01:06:07,911 DEBUG [cloud.agent.Agent] >> (agentRequest-Handler-4:null) Request:Seq 4-936706739: { Cmd , MgmtId: >> 29066118877352, via: 4, Ver: v1, Flags: 100111, >> [{"StartCommand":{"vm":{"id":29,"name":"v-29-QA","type":"ConsoleProxy","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":1073741824,"maxRam":1073741824,"arch":"x86_64","os":"Debian >> GNU/Linux 5.0 (32-bit)","bootArgs":" template=domP type=consoleproxy >> host=10.223.49.195 port=8250 name=v-29-QA premium=true zone=1 pod=1 >> guid=Proxy.29 proxy_vm=29 disable_rp_filter=true eth2ip=10.223.122.73 >> eth2mask=255.255.255.192 gateway=10.223.122.65 eth0ip=169.254.1.243 >> eth0mask=255.255.0.0 eth1ip=10.223.50.93 eth1mask=255.255.255.192 >> mgmtcidr=10.223.49.192/26 localgw=10.223.50.65 internaldns1=10.223.110.254 >> dns1=72.52.126.11","rebootOnCrash":false,"enableHA":false,"limitCpuUse":false,"vncPassword":"d099568827911cef","params":{},"uuid":"5a146833-6a8c-44e5-83c0-50f34accf513","disks":[{"id":32,"name":"ROOT-29","mountPoint":"/export/home/rayees/SC_QA_AUTO4/primary","path":"f6f8d865-e9c0-4188-8a33-6c6383ca5075","size":276406784,"type":"ROOT","storagePoolType":"NetworkFilesystem","storagePoolUuid":"fff90cb5-06dd-33b3-8815-d78c08ca01d9","deviceId":0}],"nics":[{"deviceId":2,"networkRateMbps":-1,"defaultNic":true,"uuid":"21175978-96bd-4160-8228-8ad15aa40c66","ip":"10.223.122.73","netmask":"255.255.255.192","gateway":"10.223.122.65","mac":"06:2e:5a:00:00:42","dns1":"72.52.126.11","broadcastType":"Vlan","type":"Public","broadcastUri":"vlan://1221","isolationUri":"vlan://1221","isSecurityGroupEnabled":false},{"deviceId":0,"networkRateMbps":-1,"defaultNic":false,"uuid":"e0651452-a76e-4564-96b2-3d5d51e9bcd6","ip":"169.254.1.243","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:01:f3","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false},{"deviceId":1,"networkRateMbps":-1,"defaultNic":false,"uuid":"0765c229-468e-4dfd-8382-24ac49791a8d","ip":"10.223.50.93","netmask":"255.255.255.192","gateway":"10.223.50.65","mac":"06:49:6a:00:00:1a","broadcastType":"Native","type":"Management","isSecurityGroupEnabled":false}]},"hostIp":"10.223.50.66","wait":0}},{"check.CheckSshCommand":{"ip":"169.254.1.243","port":3922,"interval":6,"retries":100,"name":"v-29-QA","wait":0}}] >> } >> 2013-05-29 01:06:07,912 DEBUG [cloud.agent.Agent] >> (agentRequest-Handler-4:null) Processing command: >> com.cloud.agent.api.StartCommand >> 2013-05-29 01:06:07,923 WARN [cloud.agent.Agent] >> (agentRequest-Handler-4:null) Caught: >> com.cloud.utils.exception.CloudRuntimeException: >> org.libvirt.LibvirtException: Storage pool not found: no pool with matching >> uuid >> at >> com.cloud.hypervisor.kvm.storage.LibvirtStorageAdaptor.getStoragePool(LibvirtStorageAdaptor.java:380) >> at >> com.cloud.hypervisor.kvm.storage.KVMStoragePoolManager.getStoragePool(KVMStoragePoolManager.java:72) >> at >> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.createVbd(LibvirtComputingResource.java:3399) >> at >> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3293) >> at >> com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1187) >> at com.cloud.agent.Agent.processRequest(Agent.java:525) >> at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:852) >> at com.cloud.utils.nio.Task.run(Task.java:83) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) >> > create VM issuing create command with primary storage >> > “/export/home/rayees/SC_QA_AUTO4/primary” but its not available in >> > pool-list >> [root@Rack2Host11 ~]# virsh pool-list --all >> Name State Autostart >> ----------------------------------------- >> 1c9fcb9a-aece-453b-8510-4de4b4dbffdc active no >> 41b632b5-40b3-3024-a38b-ea259c72579f active no >> [root@Rack2Host11 ~]# >> [root@Rack2Host11 ~]# >> [root@Rack2Host11 ~]# virsh pool-dumpxml 1c9fcb9a-aece-453b-8510-4de4b4dbffdc >> <pool type='dir'> >> <name>1c9fcb9a-aece-453b-8510-4de4b4dbffdc</name> >> <uuid>1c9fcb9a-aece-453b-8510-4de4b4dbffdc</uuid> >> <capacity unit='bytes'>237777190912</capacity> >> <allocation unit='bytes'>2098745344</allocation> >> <available unit='bytes'>235678445568</available> >> <source> >> </source> >> <target> >> <path>/var/lib/libvirt/images</path> >> <permissions> >> <mode>0700</mode> >> <owner>-1</owner> >> <group>-1</group> >> </permissions> >> </target> >> </pool> >> [root@Rack2Host11 ~]# virsh pool-dumpxml 41b632b5-40b3-3024-a38b-ea259c72579f >> <pool type='netfs'> >> <name>41b632b5-40b3-3024-a38b-ea259c72579f</name> >> <uuid>41b632b5-40b3-3024-a38b-ea259c72579f</uuid> >> <capacity unit='bytes'>11810778316800</capacity> >> <allocation unit='bytes'>9045745336320</allocation> >> <available unit='bytes'>2765032980480</available> >> <source> >> <host name='10.223.110.232'/> >> <dir path='/export/home/rayees/SC_QA_AUTO4/primary2'/> >> <format type='auto'/> >> </source> >> <target> >> <path>/mnt/41b632b5-40b3-3024-a38b-ea259c72579f</path> >> <permissions> >> <mode>0700</mode> >> <owner>-1</owner> >> <group>-1</group> >> </permissions> >> </target> >> </pool> > > -- > This message is automatically generated by JIRA. > If you think it was sent incorrectly, please contact your JIRA administrators > For more information on JIRA, see: http://www.atlassian.com/software/jira