Hi guys, 

Coming back to the problem with live migration. I've done some more testing and 
I think there is an issue (probably introduced since 4.4.x). 

I have manually set the broadcast_uri and isolation_uri values in the
database to vlan://<number>. This has indeed solved the migration problem: I
am able to migrate the VM after making the change.
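
For anyone following along, the kind of change described above can be sketched against a throwaway in-memory SQLite table rather than the real database. The table and column names (nics, instance_id, isolation_uri, broadcast_uri) and the three sample rows come from the queries quoted further down in this thread. The helper name set_nic_vlan_uri and the VLAN number 1234 are hypothetical, and the real cloud.nics table has many more columns. This is an illustration only, not a recommended fix.

```python
import sqlite3


def set_nic_vlan_uri(conn, instance_id, vlan_id):
    """Set both URIs to vlan://<vlan_id> for NICs of one instance that have NULLs.

    Hypothetical helper: only rows where both columns are NULL are touched.
    Returns the number of rows changed.
    """
    uri = f"vlan://{int(vlan_id)}"
    cur = conn.execute(
        "UPDATE nics SET isolation_uri = ?, broadcast_uri = ? "
        "WHERE instance_id = ? AND isolation_uri IS NULL AND broadcast_uri IS NULL",
        (uri, uri, instance_id),
    )
    conn.commit()
    return cur.rowcount


# Stand-in for cloud.nics, reduced to the three columns shown in this thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nics (instance_id INTEGER, isolation_uri TEXT, broadcast_uri TEXT)"
)
conn.executemany(
    "INSERT INTO nics VALUES (?, ?, ?)",
    [
        (564, "vlan://96", "vlan://96"),
        (664, None, None),
        (1111, "vlan://1127", "vlan://1127"),
    ],
)

changed = set_nic_vlan_uri(conn, 664, 1234)  # 1234 is a made-up VLAN number
rows = conn.execute(
    "SELECT instance_id, isolation_uri, broadcast_uri FROM nics ORDER BY instance_id"
).fetchall()
print(changed, rows)
```

Restricting the update to rows where both columns are NULL keeps the sketch from overwriting NICs that already carry a value. A real VM can have several NICs on different networks, so the correct VLAN number would have to be looked up per NIC.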

However, a bigger problem has surfaced. After stopping the VM, I am no longer
able to start it, even though I had no issues stopping or starting the VM
before making the db change. I've also noticed that after the VM is stopped,
both the broadcast and isolation URIs are reset to NULL. I'm not sure whether
this is the expected behaviour.
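
To see at a glance which state a NIC is in, the URI values discussed in this thread (vlan://<number>, vlan://untagged, NULL, and a bare number without a scheme) can be classified with a few lines of Python. This is a minimal sketch: the function name classify_uri and the category labels are my own, and it says nothing about how CloudStack itself parses these values.

```python
from urllib.parse import urlparse


def classify_uri(uri):
    """Classify a broadcast/isolation URI string as seen in the nics table.

    Returns one of: "unset", "untagged", "vlan:<id>", "other".
    """
    if uri is None:
        return "unset"  # NULL in the database
    parsed = urlparse(uri)
    if parsed.scheme != "vlan":
        return "other"  # e.g. a bare number without a scheme
    if parsed.netloc == "untagged":
        return "untagged"
    if parsed.netloc.isdecimal():
        return f"vlan:{int(parsed.netloc)}"
    return "other"


for value in ("vlan://96", "vlan://untagged", None, "1127"):
    print(repr(value), "->", classify_uri(value))
```

The "other" bucket is there for values stored as just a number rather than a URI, which is the situation Marcus describes further down the thread.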

Could someone help me with getting to the bottom of this issue? 

Thanks 

Andrei 

----- Original Message -----

From: "Andrija Panic" <andrija.pa...@gmail.com> 
To: d...@cloudstack.apache.org 
Cc: users@cloudstack.apache.org 
Sent: Friday, 15 May, 2015 2:01:30 PM 
Subject: Re: ACS 4.5.1 KVM live migration problem 

OK, but since they are guest NICs, this confuses me. This is an advanced zone
with VLANs, right? Then, as I understand it, all NICs of a user VM need to
have some isolation method...

Anyway, I'm running an advanced zone with VLANs, and all my VMs (both VMs
behind a VPC and VMs on the internet/public network, which is still a guest
network) have some vlan://xxxxx value.

For the VR, SSVM and CPVM, there are NICs on the "ACS public" network that
don't use a VLAN; they have "vlan://untagged". In my case, "NULL" is only used
for LinkLocal (169.x) NICs and for the mgmt/sec-storage NICs of the SSVM/CPVM.



On 15 May 2015 at 13:47, Andrei Mikhailovsky <and...@arhont.com> wrote: 

> Andrija, 
> 
> I've run the command, and it showed me a bunch of running VMs with NULLs.
> Roughly 20% of my running VMs have NULL in the isolation and broadcast
> URIs.
>
> All of these VMs are working perfectly well (in terms of network
> connectivity) and there is nothing special about them. They all have at
> least one guest NIC.
> 
> Andrei 
> ----- Original Message ----- 
> 
> From: "Andrija Panic" <andrija.pa...@gmail.com> 
> To: d...@cloudstack.apache.org 
> Cc: users@cloudstack.apache.org 
> Sent: Friday, 15 May, 2015 12:34:24 PM 
> Subject: Re: ACS 4.5.1 KVM live migration problem 
> 
> Andrei, 
> 
> select instance_id,isolation_uri,broadcast_uri from nics where instance_id 
> in (select id from vm_instance where state='Running' and name not like 
> 'r-%' and name not like 'v-%' and name not like 's-%') order by 
> instance_id; 
> 
> This gives me every NIC that does not belong to a router, SSVM or CPVM. I
> always have VLAN values: since these are all guest NICs, they must have a
> VLAN ID.
> In my case, NULL values are only present when a VM is deleted or stopped.
>
> Can you check your VM 664? What is so specific about it?
> As I understand it, in an advanced zone all NICs must have some VLAN and
> cannot be NULL or untagged, right?
> 
> On 15 May 2015 at 12:58, Andrei Mikhailovsky <and...@arhont.com> wrote: 
> 
> > 
> > 
> > Hi Andrija, Marcus, 
> > 
> > Thanks for your comments and suggestions. I've checked the cloud.nics
> > table:
> >
> > mysql> select instance_id,isolation_uri,broadcast_uri from nics where
> > instance_id=564 or instance_id=664 or instance_id=1111;
> > +-------------+---------------+---------------+
> > | instance_id | isolation_uri | broadcast_uri |
> > +-------------+---------------+---------------+
> > |         564 | vlan://96     | vlan://96     |
> > |         664 | NULL          | NULL          |
> > |        1111 | vlan://1127   | vlan://1127   |
> > +-------------+---------------+---------------+
> > 
> > 
> > From my tests, instances 564 and 1111 migrate correctly, but instance 664
> > does not, and it shows an NPE similar to the one I posted.
> >
> >
> > Is this what is causing the migration issues? If so, should I change all
> > isolation_uri and broadcast_uri values to the corresponding network VLAN
> > IDs?
> > 
> > Thanks 
> > 
> > Andrei 
> > 
> > ----- Original Message ----- 
> > 
> > From: "Andrija Panic" <andrija.pa...@gmail.com> 
> > To: d...@cloudstack.apache.org 
> > Sent: Thursday, 14 May, 2015 4:00:07 PM 
> > Subject: Re: Fwd: ACS 4.5.1 KVM live migration problem 
> > 
> > That would probably be the bug that I had... but we updated the main VLAN
> > table to change the URI, or something like that. Marcus saved me that
> > time :)
> > Andrei, please provide more info, including what Marcus asked for. I will
> > try to compare my values with yours, if that is of any help.
> > 
> > On 14 May 2015 at 16:56, Marcus <shadow...@gmail.com> wrote: 
> > 
> > > So, I vaguely remember an issue introduced a little over a year ago
> > > where the broadcast domain value of the nic was changed from a URI to
> > > just a vlan ID, which worked for vlans but broke vxlan and some other
> > > things. If I remember correctly, there would be a small set of installs
> > > during this period that wouldn't have created their nics with the
> > > correct broadcast domain value. I don't remember which versions were
> > > doing this, but I do know there's a JIRA ticket and a paper trail on how
> > > people were fixing it. The code that broke the URI was backed out. VMs
> > > created with the bad code would not be compatible with the new or the
> > > old versions of code.
> > >
> > > I was under the impression at the time that there was some SQL provided
> > > to update the values during an upgrade; perhaps that never made it in,
> > > or somehow got skipped during your upgrade process. At any rate, since
> > > there is a null pointer on broadcast domain type, you may check your
> > > nics/networks in the MySQL db and verify that the broadcast/isolation
> > > types are in URI format and not just a number. Or try to find the bug
> > > I'm referring to from around April last year.
> > > On May 14, 2015 5:04 AM, "Andrei Mikhailovsky" <and...@arhont.com> wrote:
> > > 
> > > > Hi guys, 
> > > > 
> > > > Forwarding the message to the dev list as I've not had much reply on
> > > > the users list.
> > > >
> > > > In summary, after upgrading from ACS 4.4.2 to 4.5.1 I started having
> > > > migration issues with a lot of VMs. Some VMs migrate successfully and
> > > > others do not.
> > > >
> > > > The logs are shown below.
> > > >
> > > > Could someone help me get to the bottom of this problem?
> > > > 
> > > > Thanks 
> > > > 
> > > > Andrei 
> > > > 
> > > > 
> > > > 
> > > > ----- Forwarded Message ----- 
> > > > From: "Andrei Mikhailovsky" <and...@arhont.com> 
> > > > To: users@cloudstack.apache.org 
> > > > Sent: Wednesday, 13 May, 2015 10:44:29 AM 
> > > > Subject: Re: ACS 4.5.1 KVM live migration problem 
> > > > 
> > > > Hi Rohit, 
> > > > 
> > > > I forgot to answer you about the cloud.vlan table.
> > > >
> > > > That particular VM has a network with VLAN ID 1151, as shown in the
> > > > network details in the ACS GUI. However, this VLAN is not shown in
> > > > the cloud.vlan table. From what I can see, the cloud.vlan table shows
> > > > only the public and management network VLAN interfaces and does not
> > > > show the guest network VLANs.
> > > >
> > > > The public network VLAN used for routing traffic to the internet from
> > > > this particular VM is:
> > > > 
> > > > 
> > > > mysql> select * from vlan where id=12;
> > > >
> > > >                  id: 12
> > > >                uuid: d13ea4b3-2087-4376-9d0a-f54efe2a55af
> > > >             vlan_id: vlan://2030
> > > >        vlan_gateway: 178.XXX.XXX.1
> > > >        vlan_netmask: 255.255.255.128
> > > >         description: 178.XXX.XXX.2-178.XXX.XXX.119
> > > >           vlan_type: VirtualNetwork
> > > >      data_center_id: 1
> > > >          network_id: 200
> > > > physical_network_id: 200
> > > >         ip6_gateway: NULL
> > > >            ip6_cidr: NULL
> > > >           ip6_range: NULL
> > > >             removed: NULL
> > > >             created: NULL
> > > >
> > > > 1 row in set (0.00 sec)
> > > > 
> > > > 
> > > > Hope that helps 
> > > > 
> > > > Andrei 
> > > > ----- Original Message ----- 
> > > > 
> > > > From: "Rohit Yadav" <rohit.ya...@shapeblue.com> 
> > > > To: users@cloudstack.apache.org 
> > > > Sent: Wednesday, 13 May, 2015 8:55:55 AM 
> > > > Subject: Re: ACS 4.5.1 KVM live migration problem 
> > > > 
> > > > Hi Andrei, 
> > > > 
> > > > This looks like an issue similar to
> > > > https://issues.apache.org/jira/browse/CLOUDSTACK-6893
> > > > Can you share the row from your cloud.vlan table and the value of
> > > > "select cache_mode from volume_view where vm_id=<put the vm id here>\G;"
> > > > for the VM causing the NPE?
> > > >
> > > > > On 12-May-2015, at 10:51 pm, Andrei Mikhailovsky <and...@arhont.com> wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > It seems that the problem is worse than I initially thought. In
> > > > > fact, I can't migrate most of my VMs apart from a handful, and I
> > > > > can't determine a correlation between the migratable VMs and the
> > > > > ones that produce the exception.
> > > > > 
> > > > > Thanks for any help. 
> > > > > 
> > > > > Andrei 
> > > > > 
> > > > > ----- Original Message ----- 
> > > > > 
> > > > > From: "Andrei Mikhailovsky" <and...@arhont.com> 
> > > > > To: users@cloudstack.apache.org 
> > > > > Sent: Tuesday, 12 May, 2015 8:53:16 PM 
> > > > > Subject: ACS 4.5.1 KVM live migration problem 
> > > > > 
> > > > > Hi, 
> > > > > 
> > > > > I am having an issue migrating some of my VMs after recently
> > > > > upgrading to ACS 4.5.1. I am running Ubuntu 14.04 on both the host
> > > > > and management servers. Here is the output from the log file on an
> > > > > agent:
> > > > > 
> > > > > 
> > > > > 2015-05-12 20:42:34,154 DEBUG [kvm.resource.LibvirtComputingResource] (agentRequest-Handler-1:null) Preparing host for migrating com.cloud.agent.api.to.VirtualMachineTO@21a038ac
> > > > > 2015-05-12 20:42:34,157 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-1:null) can't find connection: KVM, for vm: i-9-1162-VM, continue
> > > > > 2015-05-12 20:42:34,159 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-1:null) can't find connection: LXC, for vm: i-9-1162-VM, continue
> > > > > 2015-05-12 20:42:34,159 DEBUG [kvm.resource.LibvirtConnection] (agentRequest-Handler-1:null) can't find which hypervisor the vm used , then use the default hypervisor
> > > > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-1:null) nic=[Nic:Guest-178.248.108.205-vlan://2014]
> > > > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-1:null) creating a vNet dev and bridge for guest traffic per traffic label cloudstackbr0
> > > > > 2015-05-12 20:42:34,160 DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-1:null) Executing: /usr/share/cloudstack-common/scripts/vm/network/vnet/modifyvlan.sh -v 2014 -p bond0 -b brbond0-2014 -o add
> > > > > 2015-05-12 20:42:34,211 DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-1:null) Execution is successful.
> > > > > 2015-05-12 20:42:34,211 DEBUG [kvm.resource.BridgeVifDriver] (agentRequest-Handler-1:null) nic=[Nic:Guest-10.1.1.66-null]
> > > > > 2015-05-12 20:42:34,212 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-1:null) Disconnecting disk 23add201-e4ee-447b-a448-ecd152aea4ad
> > > > > 2015-05-12 20:42:34,212 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) Trying to fetch storage pool cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt
> > > > > 2015-05-12 20:42:34,223 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-1:null) Disconnecting disk 55100d25-410e-4fa3-a38b-7717f74d2afe
> > > > > 2015-05-12 20:42:34,223 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) Trying to fetch storage pool cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt
> > > > > 2015-05-12 20:42:34,232 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-1:null) Disconnecting disk 2db59d16-d17f-49a1-b913-7fbe4025a549
> > > > > 2015-05-12 20:42:34,233 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) Trying to fetch storage pool cf771bc7-8998-354d-8e10-5564585a3c20 from libvirt
> > > > > 2015-05-12 20:42:34,243 DEBUG [kvm.storage.KVMStoragePoolManager] (agentRequest-Handler-1:null) Disconnecting disk 17afbf31-ac89-46f7-a2c8-f8aed796e4c6
> > > > > 2015-05-12 20:42:34,243 DEBUG [kvm.storage.LibvirtStorageAdaptor] (agentRequest-Handler-1:null) Trying to fetch storage pool d8d5ec36-3cb0-39af-8fc6-084a4abd5d28 from libvirt
> > > > > 2015-05-12 20:42:34,254 WARN [cloud.agent.Agent] (agentRequest-Handler-1:null) Caught:
> > > > > java.lang.NullPointerException
> > > > >     at com.cloud.network.Networks$BroadcastDomainType.getSchemeValue(Networks.java:172)
> > > > >     at com.cloud.network.Networks$BroadcastDomainType.getValue(Networks.java:226)
> > > > >     at com.cloud.hypervisor.kvm.resource.BridgeVifDriver.plug(BridgeVifDriver.java:105)
> > > > >     at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3230)
> > > > >     at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1307)
> > > > >     at com.cloud.agent.Agent.processRequest(Agent.java:503)
> > > > >     at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:808)
> > > > >     at com.cloud.utils.nio.Task.run(Task.java:84)
> > > > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > > >     at java.lang.Thread.run(Thread.java:745)
> > > > > 2015-05-12 20:42:34,256 DEBUG [cloud.agent.Agent] (agentRequest-Handler-1:null) Seq 7-7525233502359390941: { Ans: , MgmtId: 115129173025118, via: 7, Ver: v1, Flags: 110, [{"com.cloud.agent.api.Answer":{"result":false,"details":"java.lang.NullPointerException\n\tat com.cloud.network.Networks$BroadcastDomainType.getSchemeValue(Networks.java:172)\n\tat com.cloud.network.Networks$BroadcastDomainType.getValue(Networks.java:226)\n\tat com.cloud.hypervisor.kvm.resource.BridgeVifDriver.plug(BridgeVifDriver.java:105)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.execute(LibvirtComputingResource.java:3230)\n\tat com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1307)\n\tat com.cloud.agent.Agent.processRequest(Agent.java:503)\n\tat com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:808)\n\tat com.cloud.utils.nio.Task.run(Task.java:84)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)\n\tat java.lang.Thread.run(Thread.java:745)\n","wait":0}}] }
> > > > > 
> > > > > 
> > > > > 
> > > > > Any idea how to get this fixed? I'm not sure why migration has
> > > > > suddenly stopped working for a handful of VMs. I can successfully
> > > > > migrate some VMs, but not others.
> > > > > 
> > > > > Thanks 
> > > > > 
> > > > > Andrei 
> > > > > 
> > > > > 
> > > > 
> > > > Regards, 
> > > > Rohit Yadav 
> > > > Software Architect, ShapeBlue 
> > > > M. +91 88 262 30892 | rohit.ya...@shapeblue.com 
> > > > Blog: bhaisaab.org | Twitter: @_bhaisaab 
> > > > 
> > > > 
> > > > 
> > > 
> > 
> > 
> > 
> > -- 
> > 
> > Andrija Panić 
> > 
> > 
> 
> 
> -- 
> 
> Andrija Panić 
> 
> 


-- 

Andrija Panić 
