[ https://issues.apache.org/jira/browse/CLOUDSTACK-4080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alena Prokharchyk reassigned CLOUDSTACK-4080:
---------------------------------------------

Assignee: Alena Prokharchyk

> Router VM is stopped by scavenger thread as part of DeployVMCmd if the network.gc is set to low value like "10" seconds
> -----------------------------------------------------------------------------------------------------------------------
>
> Key: CLOUDSTACK-4080
> URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4080
> Project: CloudStack
> Issue Type: Bug
> Security Level: Public(Anyone can view this level - this is the default.)
> Components: Network Controller
> Affects Versions: 4.2.0
> Environment: commit # 7522f811672f66bc0cc13a33f4f3737ef03f22af
> Reporter: venkata swamybabu budumuru
> Assignee: Alena Prokharchyk
> Priority: Blocker
> Fix For: 4.2.0
>
> Attachments: logs.tgz
>
>
> Steps to reproduce:
> 1. Have the latest CloudStack setup with at least one advanced zone.
> 2. Make sure network.gc.interval and network.gc.wait are set to "10" seconds.
> 3. Have at least one network offering of type "isolated" with all services enabled, where LB is provided by NS and the other services are provided by VR.
> mysql> select * from network_offerings where id=15\G
> *************************** 1. row ***************************
> id: 15
> name: NetworkOffering with NS
> uuid: 4aaf5c58-6d45-4213-8c26-0b2b6f6792c5
> unique_name: NetworkOffering with NS
> display_text: NetworkOffering with NS
> nw_rate: NULL
> mc_rate: 10
> traffic_type: Guest
> tags: NULL
> system_only: 0
> specify_vlan: 0
> service_offering_id: NULL
> conserve_mode: 0
> created: 2013-08-05 07:30:38
> removed: NULL
> default: 0
> availability: Optional
> dedicated_lb_service: 0
> shared_source_nat_service: 0
> sort_key: 0
> redundant_router_service: 0
> state: Enabled
> guest_type: Isolated
> elastic_ip_service: 0
> eip_associate_public_ip: 0
> elastic_lb_service: 0
> specify_ip_ranges: 0
> inline: 0
> is_persistent: 0
> internal_lb: 0
> public_lb: 1
> egress_default_policy: 1
> concurrent_connections: NULL
> mysql> select * from ntwk_offering_service_map where network_offering_id=15;
> +----+---------------------+----------------+---------------+---------------------+
> | id | network_offering_id | service        | provider      | created             |
> +----+---------------------+----------------+---------------+---------------------+
> | 58 |                  15 | Dhcp           | VirtualRouter | 2013-08-05 07:30:38 |
> | 55 |                  15 | Dns            | VirtualRouter | 2013-08-05 07:30:38 |
> | 60 |                  15 | Firewall       | VirtualRouter | 2013-08-05 07:30:38 |
> | 59 |                  15 | Lb             | Netscaler     | 2013-08-05 07:30:38 |
> | 54 |                  15 | PortForwarding | VirtualRouter | 2013-08-05 07:30:38 |
> | 56 |                  15 | SourceNat      | VirtualRouter | 2013-08-05 07:30:38 |
> | 53 |                  15 | StaticNat      | VirtualRouter | 2013-08-05 07:30:38 |
> | 57 |                  15 | UserData       | VirtualRouter | 2013-08-05 07:30:38 |
> | 61 |                  15 | Vpn            | VirtualRouter | 2013-08-05 07:30:38 |
> +----+---------------------+----------------+---------------+---------------------+
> mysql> select * from host_details where host_id=4;
> +----+---------+-------------------+-------------------------------------------+
> | id | host_id | name              | value                                     |
> +----+---------+-------------------+-------------------------------------------+
> | 13 |       4 | deviceName        | NetscalerVPXLoadBalancer                  |
> | 11 |       4 | guid              | 1cf71bde-3994-42eb-80e0-046278a1763d      |
> | 21 |       4 | ip                | 10.147.60.26                              |
> | 19 |       4 | lbdevicededicated | false                                     |
> | 23 |       4 | lbdeviceid        | 1                                         |
> | 16 |       4 | name              | 201-NetscalerVPXLoadBalancer-10.147.60.26 |
> | 17 |       4 | numretries        | 2                                         |
> | 18 |       4 | password          | ck3EWqTylg79ZMj4gG2sHA==                  |
> | 20 |       4 | physicalNetworkId | 201                                       |
> | 15 |       4 | privateinterface  | 1/2                                       |
> | 14 |       4 | publicinterface   | 1/3                                       |
> | 12 |       4 | username          | nsroot                                    |
> | 22 |       4 | zoneId            | 2                                         |
> +----+---------+-------------------+-------------------------------------------+
> 4. As a non-ROOT domain user, try to deploy a VM using the network that was created from the above network offering.
> mysql> select * from networks where id=242\G
> *************************** 1. row ***************************
> id: 242
> name: test
> uuid: c8028134-77ab-415a-bdb9-d9378754479b
> display_text: test
> traffic_type: Guest
> broadcast_domain_type: Vlan
> broadcast_uri: NULL
> gateway: 10.0.48.1
> cidr: 10.0.48.0/20
> mode: Dhcp
> network_offering_id: 15
> physical_network_id: 200
> data_center_id: 1
> guru_name: ExternalGuestNetworkGuru
> state: Allocated
> related: 242
> domain_id: 2
> account_id: 4
> dns1: 10.103.128.16
> dns2: NULL
> guru_data: NULL
> set_fields: 0
> acl_type: Account
> network_domain: cs4cloud.internal
> reservation_id: 803e1334-ed30-4980-a6f2-299427724bb9
> guest_type: Isolated
> restart_required: 0
> created: 2013-08-05 11:27:19
> removed: NULL
> specify_ip_ranges: 0
> vpc_id: NULL
> ip6_gateway: NULL
> ip6_cidr: NULL
> network_cidr: NULL
> display_network: 1
> network_acl_id: NULL
> Observations:
> (i) deployVMCmd completes without any issues, but the network scavenger shuts down the network immediately after the StartAnswer for the user VM.
> (ii) Here is the deployVMCmd:
> 2013-08-05 16:57:19,427 DEBUG
[cloud.api.ApiServlet] (catalina-exec-22:null) > ===END=== 10.252.192.25 -- GET > command=deployVirtualMachine&zoneId=7b6b3c07-7e33-483f-b2a1-2f89a0d9ff96&templateId=4643adee-fd8e-11e2-9c07-069f2c0000aa&hypervisor=KVM&serviceOfferingId=d42e0af6-370b-4a4f-a318-98d1d2a9a8e3&networkIds=c8028134-77ab-415a-bdb9-d9378754479b&displayname=test&name=test&response=json&sessionkey=YGtsbnrjR7V3vmhttXR20I2v8L0%3D&_=1375702049488 > (iii) The above command initiated a router VM deployment > 2013-08-05 16:57:25,237 DEBUG [agent.transport.Request] > (Job-Executor-8:job-59 = [ 17db0422-7b06-4bb6-b78a-ae9728bd26d1 ]) Seq > 1-1321009484: Sending { Cmd , MgmtId: 7280707764394, via: 1, Ver: v1, Flags: > 100011, > [{"com.cloud.agent.api.StartCommand":{"vm":{"id":14,"name":"r-14-VM","type":"DomainRouter","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":134217728,"maxRam":134217728,"arch":"x86_64","os":"Debian > GNU/Linux 5.0 (32-bit)","bootArgs":" template=domP name=r-14-VM > eth2ip=10.147.44.67 eth2mask=255.255.255.0 gateway=10.147.44.1 > eth0ip=10.0.48.1 eth0mask=255.255.240.0 domain=cs4cloud.internal > dhcprange=10.0.48.1 eth1ip=169.254.3.57 eth1mask=255.255.0.0 type=router > disable_rp_filter=true > 
dns1=10.103.128.16","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"821327d55071659e","params":{},"uuid":"3980de9d-1b91-4e7c-ae36-2b9a9d08ef38","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"a11a7197-2aed-40e6-9569-6787c577ab2c","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"5458182e-bfcb-351c-97ed-e7223bca2b8e","id":1,"poolType":"NetworkFilesystem","host":"10.147.28.7","path":"/export/home/swamy/primary.campo.kvm.1.zone","port":2049}},"name":"ROOT-14","size":276162048,"path":"458958b5-5497-477e-9e53-1227a0187688","volumeId":17,"vmName":"r-14-VM","accountId":4,"format":"QCOW2","id":17,"hypervisorType":"None"}},"diskSeq":0,"type":"ROOT"}],"nics":[{"deviceId":2,"networkRateMbps":200,"defaultNic":true,"uuid":"65fd9e0c-5b15-4181-9ab1-4a13fea4739e","ip":"10.147.44.67","netmask":"255.255.255.0","gateway":"10.147.44.1","mac":"06:2a:3a:00:00:12","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Public","broadcastUri":"vlan://44","isolationUri":"vlan://44","isSecurityGroupEnabled":false},{"deviceId":0,"networkRateMbps":200,"defaultNic":false,"uuid":"d2a0ff89-a6ce-4005-a875-d36787318342","ip":"10.0.48.1","netmask":"255.255.240.0","gateway":"10.0.48.1","mac":"02:00:6b:6d:00:02","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://909","isolationUri":"vlan://909","isSecurityGroupEnabled":false},{"deviceId":1,"networkRateMbps":-1,"defaultNic":false,"uuid":"0a187643-dc28-48b5-bf99-8acbe5f753c2","ip":"169.254.3.57","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:03:39","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false}]},"hostIp":"10.147.40.11","executeInSequence":false,"wait":0}},{"com.cloud.agent.api.check.CheckSshCommand":{"ip":"169.254.3.57","port":3922,"interval":6,"retries":100,"name":"r-14-VM","wait":0}},{"com.cloud.agent.api.GetDomRVersionCmd":{"accessDetails"
:{"router.ip":"169.254.3.57","router.name":"r-14-VM"},"wait":0}},{},{"com.cloud.agent.api.routing.IpAssocCommand":{"ipAddresses":[{"accountId":4,"publicIp":"10.147.44.67","sourceNat":true,"add":true,"oneToOneNat":false,"firstIP":true,"vlanId":"44","vlanGateway":"10.147.44.1","vlanNetmask":"255.255.255.0","vifMacAddress":"06:b0:5a:00:00:12","networkRate":200,"trafficType":"Public"}],"accessDetails":{"router.guest.ip":"10.0.48.1","zone.network.type":"Advanced","router.ip":"169.254.3.57","router.name":"r-14-VM"},"wait":0}}] > } > (iv) Router VM started successfully > 2013-08-05 16:58:40,412 DEBUG [agent.transport.Request] > (AgentManager-Handler-5:null) Seq 1-1321009484: Processing: { Ans: , MgmtId: > 7280707764394, via: 1, Ver: v1, Flags: 10, > [{"com.cloud.agent.api.StartAnswer":{"vm":{"id":14,"name":"r-14-VM","type":"DomainRouter","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":134217728,"maxRam":134217728,"arch":"x86_64","os":"Debian > GNU/Linux 5.0 (32-bit)","bootArgs":" template=domP name=r-14-VM > eth2ip=10.147.44.67 eth2mask=255.255.255.0 gateway=10.147.44.1 > eth0ip=10.0.48.1 eth0mask=255.255.240.0 domain=cs4cloud.internal > dhcprange=10.0.48.1 eth1ip=169.254.3.57 eth1mask=255.255.0.0 type=router > disable_rp_filter=true > 
dns1=10.103.128.16","rebootOnCrash":false,"enableHA":true,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"821327d55071659e","vncAddr":"10.147.40.11","params":{},"uuid":"3980de9d-1b91-4e7c-ae36-2b9a9d08ef38","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"a11a7197-2aed-40e6-9569-6787c577ab2c","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"5458182e-bfcb-351c-97ed-e7223bca2b8e","id":1,"poolType":"NetworkFilesystem","host":"10.147.28.7","path":"/export/home/swamy/primary.campo.kvm.1.zone","port":2049}},"name":"ROOT-14","size":276162048,"path":"458958b5-5497-477e-9e53-1227a0187688","volumeId":17,"vmName":"r-14-VM","accountId":4,"format":"QCOW2","id":17,"hypervisorType":"None"}},"diskSeq":0,"type":"ROOT"}],"nics":[{"deviceId":2,"networkRateMbps":200,"defaultNic":true,"uuid":"65fd9e0c-5b15-4181-9ab1-4a13fea4739e","ip":"10.147.44.67","netmask":"255.255.255.0","gateway":"10.147.44.1","mac":"06:2a:3a:00:00:12","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Public","broadcastUri":"vlan://44","isolationUri":"vlan://44","isSecurityGroupEnabled":false},{"deviceId":0,"networkRateMbps":200,"defaultNic":false,"uuid":"d2a0ff89-a6ce-4005-a875-d36787318342","ip":"10.0.48.1","netmask":"255.255.240.0","gateway":"10.0.48.1","mac":"02:00:6b:6d:00:02","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://909","isolationUri":"vlan://909","isSecurityGroupEnabled":false},{"deviceId":1,"networkRateMbps":-1,"defaultNic":false,"uuid":"0a187643-dc28-48b5-bf99-8acbe5f753c2","ip":"169.254.3.57","netmask":"255.255.0.0","gateway":"169.254.0.1","mac":"0e:00:a9:fe:03:39","broadcastType":"LinkLocal","type":"Control","isSecurityGroupEnabled":false}]},"result":true,"wait":0}},{"com.cloud.agent.api.check.CheckSshAnswer":{"result":true,"wait":0}},{"com.cloud.agent.api.GetDomRVersionAnswer":{"templateVersion":"Cloudstack > Release 4.2.0 Thu Jun 13 04:15:09 UTC > 
2013","scriptsVersion":"0026a7d7d957616f59bdeab0c49258bb","result":true,"details":"Cloudstack Release 4.2.0 Thu Jun 13 04:15:09 UTC 2013&0026a7d7d957616f59bdeab0c49258bb","wait":0}},{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-14-VM","bytesSent":0,"bytesReceived":0,"result":true,"wait":0}},{"com.cloud.agent.api.routing.IpAssocAnswer":{"results":["10.147.44.67 - success"],"result":true,"wait":0}}] }
> (v) After the router VM was up, it triggered the StartCommand for the user VM, and that went fine as well.
> 2013-08-05 16:58:43,001 DEBUG [agent.transport.Request] (AgentManager-Handler-14:null) Seq 1-1321009496: Processing: { Ans: , MgmtId: 7280707764394, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.StartAnswer":{"vm":{"id":13,"name":"i-4-13-VM","type":"User","cpus":1,"minSpeed":500,"maxSpeed":500,"minRam":536870912,"maxRam":536870912,"arch":"x86_64","os":"CentOS 5.5 (64-bit)","bootArgs":"","rebootOnCrash":false,"enableHA":false,"limitCpuUse":false,"enableDynamicallyScaleVm":false,"vncPassword":"b1f3f5cddd630be3","vncAddr":"10.147.40.11","params":{},"uuid":"773ccd08-cfef-41ad-8c43-30a169630d0b","disks":[{"data":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"ffa1c5ad-cdca-40dc-93bd-03c1dbdd4caf","volumeType":"ROOT","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"5458182e-bfcb-351c-97ed-e7223bca2b8e","id":1,"poolType":"NetworkFilesystem","host":"10.147.28.7","path":"/export/home/swamy/primary.campo.kvm.1.zone","port":2049}},"name":"ROOT-13","size":8589934592,"path":"ad4b1306-a145-4944-b296-775671c9624b","volumeId":16,"vmName":"i-4-13-VM","accountId":4,"format":"QCOW2","id":16,"hypervisorType":"None"}},"diskSeq":0,"type":"ROOT"},{"data":{"org.apache.cloudstack.storage.to.TemplateObjectTO":{"id":0,"format":"ISO","accountId":0,"hvm":false}},"diskSeq":3,"type":"ISO"}],"nics":[{"deviceId":0,"networkRateMbps":200,"defaultNic":true,"uuid":"302aa348-f8aa-4872-ad59-142500ed9f63","ip":"10.0.49.81","netmask":"255.255.240.0","gateway":"10.0.48.1","mac":"02:00:1a:0c:00:01","dns1":"10.103.128.16","broadcastType":"Vlan","type":"Guest","broadcastUri":"vlan://909","isolationUri":"vlan://909","isSecurityGroupEnabled":false}]},"result":true,"wait":0}}] }
> (vi) One additional observation: after the router VM started, nics.state for the user VM stayed "Allocated" for some time.
> *************************** 33. row ***************************
> id: 33
> uuid: 302aa348-f8aa-4872-ad59-142500ed9f63
> instance_id: 13
> mac_address: 02:00:1a:0c:00:01
> ip4_address: 10.0.49.81
> netmask: NULL
> gateway: NULL
> ip_type: Ip4
> broadcast_uri: NULL
> network_id: 242
> mode: Dhcp
> state: Allocated
> strategy: Start
> reserver_name: ExternalGuestNetworkGuru
> reservation_id: NULL
> device_id: 0
> update_time: 2013-08-05 16:57:19
> isolation_uri: NULL
> ip6_address: NULL
> default_nic: 1
> vm_type: User
> created: 2013-08-05 11:27:19
> removed: NULL
> ip6_gateway: NULL
> ip6_cidr: NULL
> secondary_ip: 0
> display_nic: 1
> (vii) Once the user VM is up, nics.state for its NIC changes to "Reserved".
> mysql> select * from nics where instance_id=13\G
> *************************** 1. row ***************************
> id: 33
> uuid: 302aa348-f8aa-4872-ad59-142500ed9f63
> instance_id: 13
> mac_address: 02:00:1a:0c:00:01
> ip4_address: 10.0.49.81
> netmask: 255.255.240.0
> gateway: 10.0.48.1
> ip_type: Ip4
> broadcast_uri: vlan://909
> network_id: 242
> mode: Dhcp
> state: Reserved
> strategy: Start
> reserver_name: ExternalGuestNetworkGuru
> reservation_id: 803e1334-ed30-4980-a6f2-299427724bb9
> device_id: 0
> update_time: 2013-08-05 16:58:40
> isolation_uri: vlan://909
> ip6_address: NULL
> default_nic: 1
> vm_type: User
> created: 2013-08-05 11:27:19
> removed: NULL
> ip6_gateway: NULL
> ip6_cidr: NULL
> secondary_ip: 0
> display_nic: 1
> 1 row in set (0.00 sec)
> (viii) As soon as the StartAnswer arrives for the user VM, the network scavenger thread initiates a network shutdown.
> Here is the snippet from the management server logs.
> 2013-08-05 16:58:45,756 DEBUG [agent.manager.AgentManagerImpl] (AgentManager-Handler-2:null) SeqA 2-790: Processing Seq 2-790: { Cmd , MgmtId: -1, via: 2, Ver: v1, Flags: 11, [{"com.cloud.agent.api.ConsoleProxyLoadReportCommand":{"_proxyVmId":2,"_loadInfo":"{\n \"connections\": []\n}","wait":0}}] }
> 2013-08-05 16:58:45,762 DEBUG [agent.manager.AgentManagerImpl] (AgentManager-Handler-2:null) SeqA 2-790: Sending Seq 2-790: { Ans: , MgmtId: 7280707764394, via: 2, Ver: v1, Flags: 100010, [{"com.cloud.agent.api.AgentControlAnswer":{"result":true,"wait":0}}] }
> 2013-08-05 16:58:46,685 DEBUG [network.resource.NetscalerResource] (DirectAgent-169:null) Netscaler load balancer 10.147.60.26 successfully executed IPAssocCommand to remove IP com.cloud.agent.api.to.IpAddressTO@1e6c629
> 2013-08-05 16:58:46,685 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-169:null) Seq 8-1242890258: Response Received:
> 2013-08-05 16:58:46,685 DEBUG [agent.transport.Request] (DirectAgent-169:null) Seq 8-1242890258: Processing: { Ans: , MgmtId: 7280707764394, via: 8, Ver: v1, Flags: 10, [{"com.cloud.agent.api.routing.IpAssocAnswer":{"results":["null - success"],"result":true,"wait":0}}] }
> 2013-08-05 16:58:46,686 DEBUG [agent.transport.Request] (Network-Scavenger-1:null) Seq 8-1242890258: Received: { Ans: , MgmtId: 7280707764394, via: 8, Ver: v1, Flags: 10, { IpAssocAnswer } }
> 2013-08-05 16:58:46,800 DEBUG [cloud.network.ExternalLoadBalancerDeviceManagerImpl] (Network-Scavenger-1:null) External load balancer has shut down the guest network for account dom1Acc2(id = 4) with VLAN tag 909
> 2013-08-05 16:58:46,804 DEBUG [cloud.network.NetworkManagerImpl] (Network-Scavenger-1:null) Sending network shutdown to VirtualRouter
> 2013-08-05 16:58:46,808 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (Network-Scavenger-1:null) Stopping router VM[DomainRouter|r-14-VM]
> 2013-08-05 16:58:46,817 DEBUG [cloud.capacity.CapacityManagerImpl] (Network-Scavenger-1:null) VM state transitted from :Running to Stopping with event: StopRequestedvm's original host id: 1 new host id: 1 host id before state transition: 1
> 2013-08-05 16:58:46,945 DEBUG [agent.transport.Request] (AgentManager-Handler-1:null) Seq 1-1321009498: Processing: { Ans: , MgmtId: 7280707764394, via: 1, Ver: v1, Flags: 10, [{"com.cloud.agent.api.NetworkUsageAnswer":{"routerName":"r-14-VM","bytesSent":0,"bytesReceived":0,"result":true,"details":"","wait":0}}] }
> Attaching all the required logs along with the db dump to the bug.
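
The timestamps quoted in the observations can be lined up to compare the deployment duration with the 10-second GC setting. The Python sketch below is only an illustration: the event labels and the helper names (`parse`, `elapsed_seconds`, `timeline`) are made up here, and only the timestamps and the value 10 come from this report. It does not model CloudStack's scavenger logic or claim a root cause.

```python
from datetime import datetime

# Timestamps copied from the management-server log lines quoted in this report.
# The labels are informal descriptions, not CloudStack identifiers.
EVENTS = [
    ("deployVirtualMachine API call logged", "2013-08-05 16:57:19,427"),
    ("StartCommand sent for r-14-VM", "2013-08-05 16:57:25,237"),
    ("StartAnswer received for r-14-VM", "2013-08-05 16:58:40,412"),
    ("StartAnswer received for i-4-13-VM", "2013-08-05 16:58:43,001"),
    ("Scavenger starts stopping r-14-VM", "2013-08-05 16:58:46,808"),
]

# Value the report sets for the network GC settings, in seconds.
GC_SECONDS = 10


def parse(ts):
    """Parse a log timestamp of the form 'YYYY-MM-DD HH:MM:SS,mmm'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S,%f")


def elapsed_seconds(start, end):
    """Seconds between two log timestamps."""
    return (parse(end) - parse(start)).total_seconds()


def timeline(events):
    """Return (label, seconds since the first event) for each event."""
    first = events[0][1]
    return [(label, elapsed_seconds(first, ts)) for label, ts in events]


if __name__ == "__main__":
    for label, offset in timeline(EVENTS):
        # Mark events that fall later than the GC setting after the API call.
        marker = "  (past GC setting)" if offset > GC_SECONDS else ""
        print(f"{offset:8.3f}s  {label}{marker}")
```

With these inputs, the StartAnswer for the user VM lands about 84 seconds after the API call, well past the 10-second setting, and the scavenger begins stopping the router less than 4 seconds after that.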