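It looks like CloudStack (CS) cannot work out which host is the pool master, so you need to check your two XenServer hosts. This usually happens in a two-node cluster after a network outage or a power failure. Look up the corresponding XenServer recovery procedure and follow it.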
2013-05-16
刘宇超 Richard Liu

From: tao huang
Sent: 2013-05-16 16:03:28
To: users-cn
Cc:
Subject: Re: Suddenly unable to create VMs or restart VMs

These two are the IPs of the two hosts.

2013/5/16 tanthalas <[email protected]>

> First, explain which machines these two IPs are assigned to:
> 192.168.1.38
> 192.168.1.36
>
> 2013-05-16
> 刘宇超 Richard Liu
>
> From: tao huang
> Sent: 2013-05-16 15:50:33
> To: users-cn
> Cc:
> Subject: Suddenly unable to create VMs or restart VMs
>
> Hello everyone. The environment is CloudStack 3.0 + XenServer 6.0 with advanced networking.
> VMs suddenly cannot be created. It looks like a primary storage problem, but the storage status shows as normal. How should I troubleshoot this? Thanks for any guidance.
>
> 2013-05-16 15:29:40,062 DEBUG [agent.manager.DirectAgentAttache]
> (DirectAgent-369:null) Seq 3-939886253: Response Received:
> 2013-05-16 15:29:40,062 DEBUG [agent.transport.Request]
> (DirectAgent-369:null) Seq 3-939886253: Processing: { Ans: , MgmtId:
> 33932935549676, via: 3, Ver: v1, Flags: 110,
> [{"StopAnswer":{"result":false,"details":"Exception:
> com.cloud.utils.exception.CloudRuntimeException\nMessage: *Unable to reset
> master of slave 192.168.1.38 to 192.168.1.36 due to
> org.apache.xmlrpc.XmlRpcException: Failed to create input stream: Read
> timed out\nStack: com.cloud.utils.exception.CloudRuntimeException: Unable
> to reset master of slave 192.168.1.38 to 192.168.1.36 due to
> org.apache.xmlrpc.*XmlRpcException: Failed to create input stream: Read
> timed out\n\tat
> com.cloud.hypervisor.xen.resource.XenServerConnectionPool.PoolEmergencyResetMaster(XenServerConnectionPool.java:439)\n\tat
> com.cloud.hypervisor.xen.resource.XenServerConnectionPool.connect(XenServerConnectionPool.java:657)\n\tat
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.getConnection(CitrixResourceBase.java:5090)\n\tat
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:3284)\n\tat
> com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:416)\n\tat
> com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:69)\n\tat
> com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:187)\n\tat
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)\n\tat
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)\n\tat
> java.util.concurrent.FutureTask.run(FutureTask.java:166)\n\tat
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)\n\tat
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)\n\tat
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)\n\tat
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)\n\tat
> java.lang.Thread.run(Thread.java:636)\n","wait":0}}] }
> 2013-05-16 15:29:40,062 DEBUG [agent.manager.AgentAttache]
> (DirectAgent-369:null) Seq 3-939886253: No more commands found
> 2013-05-16 15:29:40,062 DEBUG [agent.transport.Request]
> (Job-Executor-6:job-18181) Seq 3-939886253: Received: { Ans: , MgmtId:
> 33932935549676, via: 3, Ver: v1, Flags: 110, { StopAnswer } }
> 2013-05-16 15:29:40,062 DEBUG [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-6:job-18181) Unable to stop VM due to Exception:
> com.cloud.utils.exception.CloudRuntimeException
> Message: Unable to reset master of slave 192.168.1.38 to 192.168.1.36 due
> to org.apache.xmlrpc.XmlRpcException: Failed to create input stream: Read
> timed out
> Stack: com.cloud.utils.exception.CloudRuntimeException: Unable to reset
> master of slave 192.168.1.38 to 192.168.1.36 due to
> org.apache.xmlrpc.XmlRpcException: Failed to create input stream: Read
> timed out
>     at com.cloud.hypervisor.xen.resource.XenServerConnectionPool.PoolEmergencyResetMaster(XenServerConnectionPool.java:439)
>     at com.cloud.hypervisor.xen.resource.XenServerConnectionPool.connect(XenServerConnectionPool.java:657)
>     at com.cloud.hypervisor.xen.resource.CitrixResourceBase.getConnection(CitrixResourceBase.java:5090)
>     at com.cloud.hypervisor.xen.resource.CitrixResourceBase.execute(CitrixResourceBase.java:3284)
>     at com.cloud.hypervisor.xen.resource.CitrixResourceBase.executeRequest(CitrixResourceBase.java:416)
>     at com.cloud.hypervisor.xen.resource.XenServer56Resource.executeRequest(XenServer56Resource.java:69)
>     at com.cloud.agent.manager.DirectAgentAttache$Task.run(DirectAgentAttache.java:187)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:636)
> 2013-05-16 15:29:40,062 WARN [cloud.vm.VirtualMachineManagerImpl]
> (Job-Executor-6:job-18181) Failed to stop vm VM[DomainRouter|r-2237-VM] in
> Starting state as a part of cleanup process
> 2013-05-16 15:29:40,066 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) DeploymentPlanner allocation algorithm: random
> 2013-05-16 15:29:40,066 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Trying to allocate a host and storage pools from
> dc:1, pod:null,cluster:null, requested cpu: 500, requested ram: 134217728
> 2013-05-16 15:29:40,066 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Is ROOT volume READY (pool already allocated)?:
> No
> 2013-05-16 15:29:40,066 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Searching all possible resources under this
> Zone: 1
> 2013-05-16 15:29:40,068 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Listing clusters in order of aggregate capacity,
> that have (atleast one host with) enough CPU and RAM capacity under this
> Zone: 1
> 2013-05-16 15:29:40,069 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) CPUOverprovisioningFactor considered: 3.0
> 2013-05-16 15:29:40,077 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Checking resources in Cluster: 1 under Pod: 1
> 2013-05-16 15:29:40,077 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Calling HostAllocators to find suitable hosts
> 2013-05-16 15:29:40,077 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Looking for hosts in
> dc: 1 pod:1 cluster:1
> 2013-05-16 15:29:40,080 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) FirstFitAllocator has 7
> hosts to check for allocation: [Host[-4-Routing], Host[-2-Routing],
> Host[-1-Routing], Host[-3-Routing], Host[-8-Routing], Host[-7-Routing],
> Host[-15-Routing]]
> 2013-05-16 15:29:40,088 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Found 7 hosts for
> allocation after prioritization: [Host[-4-Routing], Host[-2-Routing],
> Host[-1-Routing], Host[-3-Routing], Host[-8-Routing], Host[-7-Routing],
> Host[-15-Routing]]
> 2013-05-16 15:29:40,088 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Looking for
> speed=500Mhz, Ram=128
> 2013-05-16 15:29:40,091 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Checking if host: 4 has
> enough capacity for requested CPU: 500 and requested RAM: 134217728 ,
> cpuOverprovisioningFactor: 3.0
> 2013-05-16 15:29:40,093 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Hosts's actual total
> CPU: 63984 and CPU after applying overprovisioning: 191952
> 2013-05-16 15:29:40,093 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Free CPU: 161952 ,
> Requested CPU: 500
> 2013-05-16 15:29:40,093 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Free RAM: 71156002944 ,
> Requested RAM: 134217728
> 2013-05-16 15:29:40,093 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Host has enough CPU and
> RAM available
> 2013-05-16 15:29:40,093 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) STATS: Can alloc CPU
> from host: 4, used: 29500, reserved: 500, actual total: 63984, total with
> overprovisioning: 191952; requested cpu:500,alloc_from_last_host?:false
> ,considerReservedCapacity?: true
> 2013-05-16 15:29:40,093 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) STATS: Can alloc MEM
> from host: 4, used: 28856811520, reserved: 536870912, total: 100549685376;
> requested mem: 134217728,alloc_from_last_host?:false
> ,considerReservedCapacity?: true
> 2013-05-16 15:29:40,093 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Found a suitable host,
> adding to list: 4
> 2013-05-16 15:29:40,095 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Checking if host: 2 has
> enough capacity for requested CPU: 500 and requested RAM: 134217728 ,
> cpuOverprovisioningFactor: 3.0
> 2013-05-16 15:29:40,097 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Hosts's actual total
> CPU: 63984 and CPU after applying overprovisioning: 191952
> 2013-05-16 15:29:40,097 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Free CPU: 151452 ,
> Requested CPU: 500
> 2013-05-16 15:29:40,097 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Free RAM: 58539536512 ,
> Requested RAM: 134217728
> 2013-05-16 15:29:40,097 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Host has enough CPU and
> RAM available
> 2013-05-16 15:29:40,097 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) STATS: Can alloc CPU
> from host: 2, used: 40500, reserved: 0, actual total: 63984, total with
> overprovisioning: 191952; requested cpu:500,alloc_from_last_host?:false
> ,considerReservedCapacity?: true
> 2013-05-16 15:29:40,097 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) STATS: Can alloc MEM
> from host: 2, used: 42010148864, reserved: 0, total: 100549685376;
> requested mem: 134217728,alloc_from_last_host?:false
> ,considerReservedCapacity?: true
> 2013-05-16 15:29:40,097 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Found a suitable host,
> adding to list: 2
> 2013-05-16 15:29:40,114 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Found a suitable host,
> adding to list: 15
> *2013-05-16 15:29:40,114 DEBUG [allocator.impl.FirstFitAllocator]
> (Job-Executor-6:job-18181 FirstFitRoutingAllocator) Host Allocator
> returning 6 suitable hosts*
> *2013-05-16 15:29:40,116 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Checking suitable pools for volume (Id, Type):
> (2438,ROOT)*
> *2013-05-16 15:29:40,116 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) We need to allocate new storagepool for this
> volume*
> *2013-05-16 15:29:40,116 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Calling StoragePoolAllocators to find suitable
> pools*
> *2013-05-16 15:29:40,117 DEBUG
> [storage.allocator.FirstFitStoragePoolAllocator] (Job-Executor-6:job-18181)
> Looking for pools in dc: 1 pod:1 cluster:1*
> *2013-05-16 15:29:40,118 DEBUG
> [storage.allocator.FirstFitStoragePoolAllocator] (Job-Executor-6:job-18181)
> FirstFitStoragePoolAllocator has 1 pools to check for allocation*
> *2013-05-16 15:29:40,118 DEBUG
> [storage.allocator.AbstractStoragePoolAllocator] (Job-Executor-6:job-18181)
> Checking if storage pool is suitable, name: ps ,poolId: 200*
> *2013-05-16 15:29:40,118 DEBUG
> [storage.allocator.AbstractStoragePoolAllocator] (Job-Executor-6:job-18181)
> StoragePool is in avoid set, skipping this pool*
> *2013-05-16 15:29:40,118 DEBUG
> [storage.allocator.FirstFitStoragePoolAllocator] (Job-Executor-6:job-18181)
> FirstFitStoragePoolAllocator returning 0 suitable storage pools*
> *2013-05-16 15:29:40,118 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) No suitable pools found for volume:
> Vol[2438|vm=2237|ROOT] under cluster: 1*
> *2013-05-16 15:29:40,118 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) No suitable pools found*
> *2013-05-16 15:29:40,118 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) No suitable storagePools found under this
> Cluster: 1*
> *2013-05-16 15:29:40,118 DEBUG [cloud.deploy.FirstFitPlanner]
> (Job-Executor-6:job-18181) Could not find suitable Deployment Destination
> for this VM under any clusters, returning.*
> *2013-05-16 15:29:40,135 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181) VM state transitted from :Starting to Stopped
> with event: OperationFailedvm's original host id: null new host id: null
> host id before state transition: 3*
> *2013-05-16 15:29:40,139 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181) Hosts's actual total CPU: 63984 and CPU after
> applying overprovisioning: 191952*
> *2013-05-16 15:29:40,139 DEBUG [cloud.capacity.CapacityManagerImpl]
> (Job-Executor-6:job-18181) release cpu from host: 3, old used:
> 29000,reserved: 0, actual total: 63984, total with overprovisioning:
> 191952; new used: 28500,reserved:0; movedfromreserved:
> false,moveToReserveredfalse*
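
The two signals that matter in a log like the one quoted in this thread are the failed pool-master reset ("Unable to reset master of slave ...") and the storage pool being skipped ("StoragePool is in avoid set"). As a rough aid for picking them out of a long management-server log, here is a minimal Python sketch. It is not part of CloudStack; the function names `normalize` and `summarize` are made up for illustration, and the patterns are simply copied from the message wording quoted in this thread, so they may need adjusting for other versions.

```python
import re

# Timestamp format as it appears in the quoted log, e.g. "2013-05-16 15:29:40,118".
TIMESTAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"


def normalize(log_text):
    """Undo mail formatting: drop '*' emphasis marks and '>' quote markers,
    then join everything into one space-separated string."""
    tokens = log_text.replace("*", "").split()
    return " ".join(t for t in tokens if set(t) != {">"})


def summarize(log_text):
    """Return the failed master resets and the storage pools that were skipped.

    Assumption: message wording matches the log quoted in this thread.
    """
    text = normalize(log_text)

    # (slave_ip, master_ip) pairs for which the reset failed.
    ip = r"\d+(?:\.\d+){3}"
    resets = sorted(set(re.findall(
        r"Unable to reset master of slave (%s) to (%s)" % (ip, ip), text)))

    # Split into log entries at each timestamp, then pair every
    # "in avoid set" entry with the pool id checked just before it.
    entries = re.split(r"\s(?=%s\s)" % TIMESTAMP, text)
    avoided = set()
    last_pool = None
    for entry in entries:
        checked = re.search(
            r"Checking if storage pool is suitable.*?poolId: (\d+)", entry)
        if checked:
            last_pool = checked.group(1)
        elif "StoragePool is in avoid set" in entry and last_pool is not None:
            avoided.add(last_pool)

    return {"failed_master_resets": resets, "avoided_pools": sorted(avoided)}
```

Passing the pasted log text to `summarize` returns a small dictionary. A non-empty `failed_master_resets` list points at the XenServer pool master problem described in the reply, and `avoided_pools` lists the primary storage pools the planner skipped for that deployment attempt.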
