[ https://issues.apache.org/jira/browse/CLOUDSTACK-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sudha Ponnaganti closed CLOUDSTACK-4499. ---------------------------------------- fixed in 4.2 and 4.2.1 > Xen6.1/Xen6.2 hosts initially transition to 'Alert' and then to 'Up' after > addHost > ---------------------------------------------------------------------------------- > > Key: CLOUDSTACK-4499 > URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4499 > Project: CloudStack > Issue Type: Bug > Security Level: Public(Anyone can view this level - this is the > default.) > Components: API, Management Server, XenServer > Affects Versions: 4.2.0 > Reporter: Prasanna Santhanam > Assignee: Koushik Das > Priority: Critical > Fix For: 4.2.1 > > Attachments: cloud.sql.tar.bz2, mslog.tar.bz2, SMlog > > > The same bug was reported in CLOUDSTACK-3839 which has been determined to be a > UI issue but the problem is with the addHost API and Xenserver hosts 6.1/6.2. > When a freshly provisioned 6.1/6.2 host is added to cloudstack it initially > goes into 'Alert' state thereby failing the storage pool creation > subsequently. > StoragePool addition requires the hosts to be in Up state. But a minute or two > later the Xenserver host automatically moves back to Up state after which > storage pool addition can occur > When the host is added to cloudstack: > 2013-08-26 06:46:22,965 DEBUG [agent.manager.DirectAgentAttache] > (DirectAgent-1:null) Seq 1-1916403719: Executing request > 2013-08-26 06:46:23,261 DEBUG [xen.resource.CitrixResourceBase] > (DirectAgent-1:null) Can't find a vif on dom0 for link local, creating a new > one > 2013-08-26 06:46:23,274 DEBUG [xen.resource.CitrixResourceBase] > (DirectAgent-1:null) Lowest available Vif device number: 0 for VM: Control > domain on host: apache-81-3 > 2013-08-26 06:46:23,600 WARN [xen.resource.CitrixResourceBase] > (DirectAgent-1:null) Unable to create local link network > The server failed to handle your request, due to an internal error. The > given message may give details useful for debugging the problem. > at com.xensource.xenapi.Types.checkResponse(Types.java:1694) > at com.xensource.xenapi.Connection.dispatch(Connection.java:368) > at > com.cloud.hypervisor.xen.resource.XenServerConnectionPool$XenServerConnection.dispatch(XenServerConnectionPool.java:909) > at com.xensource.xenapi.VIF.plug(VIF.java:846) > Host marked Alert: > 2013-08-26 06:46:23,616 DEBUG [agent.manager.AgentManagerImpl] > (catalina-exec-14:null) Sending Disconnect to listener: > com.cloud.network.NetworkUsageManagerImpl$DirectNetworkStatsListener > 2013-08-26 06:46:23,616 DEBUG [cloud.network.NetworkUsageManagerImpl] > (catalina-exec-14:null) Disconnected called on 1 with status Alert > 2013-08-26 06:46:23,616 DEBUG [agent.manager.AgentManagerImpl] > (catalina-exec-14:null) Sending Disconnect to listener: > com.cloud.consoleproxy.ConsoleProxyListener > 2013-08-26 06:46:23,619 DEBUG [cloud.host.Status] (catalina-exec-14:null) > Transition:[Resource state = Enabled, Agent event = AgentDisconnected, Host > id = 1, name = apache-81-3] > 2013-08-26 06:46:23,625 DEBUG [cloud.host.Status] (catalina-exec-14:null) > Agent status update: [id = 1; name = apache-81-3; old status = Connecting; > event = AgentDisconnected; new status = Alert; old update count = 1; new > update count = 2] > Automatic retry by CloudStack a few moments later: > 2013-08-26 06:46:52,752 DEBUG [host.dao.HostDaoImpl] (ClusteredAgentManager > Timer:null) Completed acquiring hosts for clusters not owned by any > management server > 2013-08-26 06:46:52,756 DEBUG [agent.manager.ClusteredAgentManagerImpl] > (ClusteredAgentManager Timer:null) Found 2 unmanaged direct hosts, processing > connect for them... > 2013-08-26 06:46:52,757 DEBUG [agent.manager.ClusteredAgentManagerImpl] > (ClusteredAgentManager Timer:null) Loading directly connected host > 1(apache-81-3) > Meanwhile Storage Pool addition fails: > 2013-08-26 06:46:52,766 DEBUG [cloud.api.ApiServlet] (catalina-exec-17:null) > ===END=== 10.208.8.5 -- GET > username=root&apiKey=sLUjl2n1c33JJpY4ZzMtObUgp9Ah2xW5inofCclZQQ1V6xp5k3CmAsGhemLwQxVvQY67EiopM_fbWBoJRKCIww&podid=fdf30680-d5c7-425e-bd7f-fb8c52e7cb96&hypervisor=XenServer&clusterid=10ff9a88-54bc-405b-9b4b-c26acf9caca8&zoneid=0431023e-ff78-43b6-a32d-262f36a78cbb&command=addHost&url=http%3A%2F%2Fapache-81-2&signature=oJqOno9flCtPMSSHH%2FZD%2BJl5QXw%3D&response=json > 2013-08-26 06:46:52,773 DEBUG [cloud.api.ApiServlet] (catalina-exec-16:null) > ===START=== 10.208.8.5 -- GET > apiKey=sLUjl2n1c33JJpY4ZzMtObUgp9Ah2xW5inofCclZQQ1V6xp5k3CmAsGhemLwQxVvQY67EiopM_fbWBoJRKCIww&name=z0p0c0ps0&url=nfs%3A%2F%2Fnfs.fmt.vmops.com%3A%2Fexport%2Fautomation%2Facs%2Fprimary&podid=fdf30680-d5c7-425e-bd7f-fb8c52e7cb96&clusterid=10ff9a88-54bc-405b-9b4b-c26acf9caca8&zoneid=0431023e-ff78-43b6-a32d-262f36a78cbb&command=createStoragePool&signature=70CuFbmf58PU4X0QbF1tKbBGT60%3D&response=json > 2013-08-26 06:46:52,787 DEBUG [cloud.network.NetworkModelImpl] > (ClusteredAgentManager Timer:null) Failed to retrive the default label for > storage traffic:zone: 1 hypervisor: XenServer due to:Unable to find the > default physical network with traffic=Storage in the specified zone id > 2013-08-26 06:46:52,839 DEBUG > [datastore.lifecycle.CloudStackPrimaryDataStoreLifeCycleImpl] > (catalina-exec-16:null) createPool Params @ scheme - nfs storageHost - > nfs.fmt.vmops.com hostPath - /export/automation/acs/primary port - -1 > 2013-08-26 06:46:52,889 DEBUG [agent.manager.ClusteredAgentManagerImpl] > (ClusteredAgentManager Timer:null) Loading directly connected host > 2(apache-81-2) > 2013-08-26 06:46:52,892 DEBUG [agent.manager.AgentManagerImpl] > (AgentTaskPool-2:null) Simulating start for resource > 192267e4-d97d-4e37-b899-82b157e3166f id 1 > 2013-08-26 06:46:52,900 DEBUG [cloud.storage.StorageManagerImpl] > (catalina-exec-16:null) Failed to add data store > com.cloud.utils.exception.CloudRuntimeException: No host up to associate a > storage pool with in cluster 1 > at > org.apache.cloudstack.storage.datastore.lifecycle.CloudStackPrimaryDataStoreLifeCycleImpl.attachCluster(CloudStackPrimaryDataStoreLifeCycleImpl.java:371) > at > com.cloud.storage.StorageManagerImpl.createPool(StorageManagerImpl.java:749) > at > com.cloud.storage.StorageManagerImpl.createPool(StorageManagerImpl.java:177) > Host comes back to Up state: > 2013-08-26 06:46:53,898 DEBUG [agent.manager.DirectAgentAttache] > (DirectAgent-4:null) Seq 1-1519321095: Executing request > 2013-08-26 06:46:54,279 DEBUG [xen.resource.CitrixResourceBase] > (DirectAgent-4:null) already have a vif on dom0 for link local network > 2013-08-26 06:46:54,703 DEBUG [agent.transport.Request] > (AgentTaskPool-2:null) Seq -1--1: Startup request from directly connected > host: { Cmd , MgmtId: -1, via: -1, Ver: v1, Flags: 11, > [{"cpus":4,"speed":2261,"memory":16190149248,"dom0MinMemory":588251136,"poolSync":false,"_clusterVMStates":{},"caps":"xen-3.0-x86_64 > , xen-3.0-x86_32p , hvm-3.0-x86_32 , hvm-3.0-x86_32p , > hvm-3.0-x86_64","pool":"9dfa936a-30f7-b5d1-4d4e-b5a75facb313","hypervisorType":"XenServer","hostDetails":{"product_version":"6.1.0","private.network.device":"Pool-wide > network associated with > eth0","com.cloud.network.Networks.RouterPrivateIpStrategy":"DcGlobal","product_brand":"XenServer","product_version_text_short":"6.1","can_bridge_firewall":"false"},"hypervisorVersion":"6.1.0","type":"Routing","dataCenter":"1","pod":"1","cluster":"1","guid":"192267e4-d97d-4e37-b899-82b157e3166f","name":"apache-81-3","version":"4.2.0","iqn":"iqn.2013-08.com.vmops.fmt:691cf4d5","publicIpAddress":"10.208.8.69","publicNetmask":"255.255.255.192","publicMacAddress":"b8:ac:6f:94:f1:67","privateIpAddress":"10.208.8.69","privateMacAddress":"b8:ac:6f:94:f1:67","privateNetmask":"255.255.255.192","storageIpAddress":"10.208.8.69","storageNetmask":"255.255.255.192","storageMacAddress":"b8:ac:6f:94:f1:67","wait":0},{"totalSize":0,"poolInfo":{"uuid":"ae248ed7-dd2a-cfc8-53f7-f43661c596c8","host":"10.208.8.69","localPath":"lvm","hostPath":"lvm","poolType":"LVM","capacityBytes":491505319936,"availableBytes":491501125632},"resourceType":"STORAGE_POOL","hostDetails":{},"type":"Storage","dataCenter":"1","guid":"192267e4-d97d-4e37-b899-82b157e3166f","wait":0}] > } > 2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] > (AgentTaskPool-2:null) Dispatching resource state event > CREATE_HOST_VO_FOR_DIRECT_CONNECT to > BaremetalPxeManagerImpl_EnhancerByCloudStack_d79b93ae > 2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] > (AgentTaskPool-2:null) Dispatching resource state event > CREATE_HOST_VO_FOR_DIRECT_CONNECT to > PremiumSecondaryStorageManagerImpl_EnhancerByCloudStack_b51c5fde > 2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] > (AgentTaskPool-2:null) Dispatching resource state event > CREATE_HOST_VO_FOR_DIRECT_CONNECT to > LxcServerDiscoverer_EnhancerByCloudStack_449ab3e6 > 2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] > (AgentTaskPool-2:null) Dispatching resource state event > CREATE_HOST_VO_FOR_DIRECT_CONNECT to NiciraNvp > 2013-08-26 06:46:54,709 DEBUG [cloud.resource.ResourceManagerImpl] > (AgentTaskPool-2:null) Dispatching resource state event > CREATE_HOST_VO_FOR_DIRECT_CONNECT to > OvmDiscoverer_EnhancerByCloudStack_ec7ad054 > 2013-08-26 06:46:54,710 DEBUG [cloud.resource.ResourceManagerImpl] > (AgentTaskPool-2:null) Dispatching resource state event > CREATE_HOST_VO_FOR_DIRECT_CONNECT to > XcpServerDiscoverer_EnhancerByCloudStack_831158ad > 2013-08-26 06:46:54,710 INFO [xen.discoverer.XcpServerDiscoverer] > (AgentTaskPool-2:null) Host: apache-81-3 connected with hypervisor type: > XenServer. Checking CIDR... > 2013-08-26 06:46:54,735 DEBUG [cloud.resource.ResourceState] > (AgentTaskPool-2:null) Resource state update: [id = 1; name = apache-81-3; > old state = Enabled; event = InternalCreated; new state = Enabled] > 2013-08-26 06:46:54,736 DEBUG [cloud.host.Status] (AgentTaskPool-2:null) > Transition:[Resource state = Enabled, Agent event = AgentConnected, Host id = > 1, name = apache-81-3] > 2013-08-26 06:46:54,742 DEBUG [cloud.host.Status] (AgentTaskPool-2:null) > Agent status update: [id = 1; name = apache-81-3; old status = Connecting; > event = AgentConnected; new status = Connecting; old update count = 3; new > update count = 4] > 2013-08-26 06:46:54,946 DEBUG [cloud.host.Status] (AgentTaskPool-1:null) > Agent status update: [id = 1; name = apache-81-3; old status = Connecting; > event = Ready; new status = Up; old update count = 4; new update count = 5] > There appear to be two problems: > 1. addHost API returns successfully even if the host has transitioned to > 'Alert' state > 2. Something wrong with Xenserver 6.2/6.1 hosts that makes this happen > This is on the latest off the 4.2 branch (8689f7f). > Attaching management server log and Database Dump. -- This message was sent by Atlassian JIRA (v6.1#6144)