Yes, Kambiz, you traced it correctly, and vm id=15 is the culprit. Since vm
id=15 is being expunged, we have to clear the reference to it from the
user_ip_address table. Here is the flow:

1) Save a db dump.
2) Run the query to clean up the reference:

Update user_ip_address set one_to_one_nat=0, vm_id=null where
id=<problematic public ip address id>;
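If you are not sure which id to use, here is a rough sketch of how the
lookup and cleanup could be done in one session. It assumes the column
holding the VM reference in user_ip_address is vm_id (as in the select
queries quoted below), that vm 15 and network 205 are the values from your
output, and that the table is InnoDB so the transaction can be rolled back.
Treat it as an illustration rather than a verified procedure:

```sql
-- Assumption: vm_id is the column referencing the VM; 15 and 205 come from
-- the query output quoted further down in this thread.

-- Look up the address rows that still point at the expunging VM.
SELECT id, public_ip_address, vm_id, one_to_one_nat
FROM user_ip_address
WHERE vm_id = 15 AND network_id = 205;

-- Clear the stale static nat reference inside a transaction.
START TRANSACTION;
UPDATE user_ip_address
SET one_to_one_nat = 0, vm_id = NULL
WHERE vm_id = 15 AND network_id = 205;
-- Number of rows changed by the UPDATE above.
SELECT ROW_COUNT();
COMMIT;
```

If the row count is not what you expect, issue ROLLBACK instead of COMMIT
and the table stays unchanged.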

Let me know how it works.


On 3/24/14, 10:55 AM, "Kambiz Darabi" wrote:

>I hope I have understood what you wrote. I created the following query:
>select uip.vm_id, uip.network_id, uip.public_ip_address,
>       n.state as nic_state, n.removed as nic_removed,
>       vm.state as vm_state, vm.removed as vm_removed
>from user_ip_address uip
>     join nics n on uip.vm_id = n.instance_id
>     join vm_instance vm on uip.vm_id = vm.id
>where uip.id in (Select ip_address_id from firewall_rules fr where
>| vm_id | network_id | public_ip_address | nic_state    | nic_removed         | vm_state  | vm_removed |
>|     6 |        205 |                   | Allocated    | NULL                | Stopped   | NULL       |
>|    10 |        205 |                   | Allocated    | NULL                | Stopped   | NULL       |
>|    12 |        205 |                   | Allocated    | NULL                | Stopped   | NULL       |
>|    13 |        205 |                   | Allocated    | NULL                | Stopped   | NULL       |
>|    14 |        205 |                   | Allocated    | NULL                | Stopped   | NULL       |
>|    15 |        205 |                   | Deallocating | 2014-03-18 23:00:53 | Expunging | NULL       |
>|    16 |        205 |                   | Allocated    | NULL                | Stopped   | NULL       |
>Is VM id 15 what you are looking for?
>Thank you
>Alena Prokharchyk wrote:
>> Kambiz, can you please try one more thing?
>> 1) Locate all the firewall rules for your guest network (205, right?):
>> Select id, ip_address_id from firewall_rules where network_id=205;
>> 2) Now get all static nat enabled ip addresses for those rules:
>> Select vm_id, network_id from user_ip_address where id in (Select
>> ip_address_id from firewall_rules where network_id=205);
>> For each vmId/networkId combo, check whether there is a non-removed nic
>> and a non-expunged vm. There might be a stale static nat ip/vm mapping
>> referring to a vm that has already been removed. If you find any, let me
>> know and I will tell you how to clean it up.
>> -Alena.
>> On 3/22/14, 5:41 AM, "Kambiz Darabi" wrote:
>>>Hi Alena,
>>>thank you for your help.
>>>The query returns no rows, i.e. nics.removed was not null. I removed the
>>>row anyway to see what happens: a new virtual router was created,
>>>which also couldn't be started due to the same NPE. I reverted the
>>>change by restoring from the dump.
>>>I have to mention that prior to the restart, r-7-VM was the router which
>>>was used by my instances. I deleted the router using the UI after the
>>>occurrence of the NPE, because a post with a similar problem suggested
>>>that the deleted router would be recreated again (and this procedure
>>>solved the problem).
>>>Below I have attached the state of the two tables.
>>>Anything else I can try?
>>>Thank you
>>>mysql> select n.id, n.removed, n.ip4_address, n.netmask, n.gateway,
>>>n.ip_type, n.reserver_name, n.network_id, i.id as instance_id, i.name,
>>>i.state, i.type from vm_instance i join nics n on n.instance_id = i.id
>>>where i.type = 'DomainRouter';
>>>| id | removed             | ip4_address | netmask | gateway | ip_type | reserver_name            | network_id | instance_id | name   | state     | type         |
>>>|  9 | 2014-03-17 11:27:58 |             |         | NULL    | NULL    | ExternalGuestNetworkGuru |        204 |           4 | r-4-VM | Expunging | DomainRouter |
>>>| 10 | 2014-03-17 11:27:58 | NULL        | NULL    | NULL    | NULL    | ControlNetworkGuru       |        202 |           4 | r-4-VM | Expunging | DomainRouter |
>>>| 11 | 2014-03-17 11:27:58 |             |         |         | NULL    | PublicNetworkGuru        |        200 |           4 | r-4-VM | Expunging | DomainRouter |
>>>| 14 | 2014-03-17 11:27:52 |             |         | NULL    | NULL    | ExternalGuestNetworkGuru |        205 |           7 | r-7-VM | Expunging | DomainRouter |
>>>| 15 | 2014-03-17 11:27:52 | NULL        | NULL    | NULL    | NULL    | ControlNetworkGuru       |        202 |           7 | r-7-VM | Expunging | DomainRouter |
>>>| 16 | 2014-03-17 11:27:52 |             |         |         | NULL    | PublicNetworkGuru        |        200 |           7 | r-7-VM | Expunging | DomainRouter |
>>>| 26 | 2014-03-18 08:11:16 |             |         | NULL    | NULL    | ExternalGuestNetworkGuru |        205 |          18 |        | Expunging | DomainRouter |
>>>| 27 | 2014-03-18 08:11:16 | NULL        | NULL    | NULL    | NULL    | ControlNetworkGuru       |        202 |          18 |        | Expunging | DomainRouter |
>>>| 28 | 2014-03-18 08:11:16 |             |         |         | NULL    | PublicNetworkGuru        |        200 |          18 |        | Expunging | DomainRouter |
>>>| 29 | NULL                |             |         | NULL    | NULL    | ExternalGuestNetworkGuru |        205 |          19 |        | Stopped   | DomainRouter |
>>>| 30 | NULL                | NULL        | NULL    | NULL    | NULL    | ControlNetworkGuru       |        202 |          19 |        | Stopped   | DomainRouter |
>>>| 31 | NULL                |             |         |         | NULL    | PublicNetworkGuru        |        200 |          19 |        | Stopped   | DomainRouter |
>>>mysql> select * from router_network_ref;
>>>| id | router_id | network_id | guest_type |
>>>|  1 |         4 |        204 | Isolated   |
>>>|  2 |         7 |        205 | Isolated   |
>>>|  3 |        18 |        205 | Isolated   |
>>>|  4 |        19 |        205 | Isolated   |
>>>Alena Prokharchyk wrote:
>>>> The error happens not because the IP is null, but because the nic in a
>>>> network can't be found. It looks like there is a bug in the VPC nic
>>>> plug/unplug process for Guest networks.
>>>> Kambiz, please do the following to fix it:
>>>> 1) Stop the MS.
>>>> 2) Take a DB dump of the cloud db in case you have to revert.
>>>> 3) Run the query:
>>>> select * from router_network_ref where router_id=<ID of your VR> and
>>>> network_id not in (select network_id from nics where instance_id=<ID of
>>>> your VR> and removed is null);
>>>> It will give you the list of network refs that somehow weren't removed
>>>> during the nic detach. Remove the entries returned from the
>>>> router_network_ref table.
>>>> Let me know how it works.
>>>> -Alena.
>>>> On 3/21/14, 3:36 PM, "Kambiz Darabi" wrote:
>>>>>As this is my first post to the list, I would like to thank all
>>>>>contributors for CloudStack, which I have been using since last fall
>>>>>without any problems. I run 4.1.1 with KVM and advanced networking.
>>>>>After a restart of the management server (stopping and starting the
>>>>>process), the virtual domain router doesn't start and
>>>>>management-server.log shows a NullPointerException in
>>>>>NetworkModelImpl.getIpInNetwork (cf. stack trace below).
>>>>>By putting the server in debug mode and remote debugging, I found out
>>>>>that the reason is a row in the table nics which has NULL in ip (cf.
>>>>>the row with id 30 in the result of the select statement below).
>>>>>What can I do to quickly solve this problem? Any pointers or
>>>>>are appreciated as the system is currently unusable.
>>>>>Thank you for your help
>>>>>2014-03-18 10:03:27,151 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking VirtualRouter to prepare for
>>>>>2014-03-18 10:03:27,151 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking Ovs to prepare for
>>>>>2014-03-18 10:03:27,151 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking SecurityGroupProvider to prepare for
>>>>>2014-03-18 10:03:27,151 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking VpcVirtualRouter to prepare for
>>>>>2014-03-18 10:03:27,151 WARN
>>>>>(Job-Executor-1:job-176) Network Ntwk[205|Guest|8] is not associated
>>>>>with any VPC
>>>>>2014-03-18 10:03:27,151 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking NiciraNvp to prepare for
>>>>>2014-03-18 10:03:27,151 DEBUG [network.element.NiciraNvpElement]
>>>>>(Job-Executor-1:job-176) Checking if NiciraNvpElement can handle
>>>>>Connectivity on network net1
>>>>>2014-03-18 10:03:27,153 DEBUG []
>>>>>(Job-Executor-1:job-176) Service SecurityGroup is not supported in the
>>>>>network id=205
>>>>>2014-03-18 10:03:27,156 DEBUG []
>>>>>(Job-Executor-1:job-176) Lock is acquired for network id 202 as a part
>>>>>of network implement
>>>>>2014-03-18 10:03:27,156 DEBUG []
>>>>>(Job-Executor-1:job-176) Network id=202 is already implemented
>>>>>2014-03-18 10:03:27,157 DEBUG []
>>>>>(Job-Executor-1:job-176) Lock is released for network id 202 as a part
>>>>>of network implement
>>>>>2014-03-18 10:03:27,187 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking VirtualRouter to prepare for
>>>>>2014-03-18 10:03:27,187 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking Ovs to prepare for
>>>>>2014-03-18 10:03:27,187 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking SecurityGroupProvider to prepare for
>>>>>2014-03-18 10:03:27,187 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking VpcVirtualRouter to prepare for
>>>>>2014-03-18 10:03:27,187 WARN
>>>>>(Job-Executor-1:job-176) Network Ntwk[202|Control|3] is not associated
>>>>>with any VPC
>>>>>2014-03-18 10:03:27,188 DEBUG []
>>>>>(Job-Executor-1:job-176) Asking NiciraNvp to prepare for
>>>>>2014-03-18 10:03:27,188 DEBUG [network.element.NiciraNvpElement]
>>>>>(Job-Executor-1:job-176) Checking if NiciraNvpElement can handle
>>>>>Connectivity on network null
>>>>>2014-03-18 10:03:27,190 DEBUG []
>>>>>(Job-Executor-1:job-176) Checking if we need to prepare 1 volumes for
>>>>>2014-03-18 10:03:27,190 DEBUG []
>>>>>(Job-Executor-1:job-176) No need to recreate the volume:
>>>>>Vol[24|vm=19|ROOT], since it already has a pool assigned: 200, adding
>>>>>disk to VM
>>>>>2014-03-18 10:03:27,224 DEBUG
>>>>>(Job-Executor-1:job-176) Boot Args for VM[DomainRouter|r-19-VM]:
>>>>>template=domP name=r-19-VM eth2ip= eth2mask=
>>>>>gateway= eth0ip= eth0mask=
>>>>>domain=cs6cloud.internal dhcprange= eth0ip=
>>>>>eth0mask= type=router disable_rp_filter=true
>>>>>2014-03-18 10:03:27,343 DEBUG
>>>>>(Job-Executor-1:job-176) Found 8 ip(s) to apply as a part of domR
>>>>>VM[DomainRouter|r-19-VM] start.
>>>>>2014-03-18 10:03:27,415 DEBUG
>>>>>(Job-Executor-1:job-176) Resending ipAssoc, port forwarding, load
>>>>>balancing rules as a part of Virtual router start
>>>>>2014-03-18 10:03:27,499 DEBUG
>>>>>(Job-Executor-1:job-176) Found 12 firewall Egress rule(s) to apply as
>>>>>a part of domR VM[DomainRouter|r-19-VM] start.
>>>>>2014-03-18 10:03:27,593 ERROR [cloud.vm.VirtualMachineManagerImpl]
>>>>>(Job-Executor-1:job-176) Failed to start instance
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>   at 
>>>>>table nics:
>>>>>mysql> select * from nics where reserver_name = 'ControlNetworkGuru';
>>>>>| id | uuid                                 | instance_id |
>>>>>    | ip4_address   | netmask     | gateway     | ip_type |
>>>>>| network_id | mode   | state        | strategy | reserver_name      |
>>>>>reservation_id                       | device_id | update_time
>>>>> |
>>>>>isolation_uri | ip6_address | default_nic | vm_type            |
>>>>>           | removed             | ip6_gateway | ip6_cidr |
>>>>>|  2 | 289aacb8-cfd7-4879-a632-6cfbda36cbf4 |           1 |
>>>>>0e:00:a9:fe:00:55 |  | | | Ip4
>>>>>NULL          |        202 | Static | Reserved     | Start    |
>>>>>ControlNetworkGuru | 993864b4-9dde-47d6-8fd6-cf94050442c6 |         0
>>>>>2014-03-17 22:21:38 | NULL          | NULL        |           0 |
>>>>>SecondaryStorageVm | 2013-09-06 12:44:42 | NULL                | NULL
>>>>>   | NULL     |
>>>>>|  6 | 5fdf4b1a-b90c-4c79-9d42-9eaf87eaa042 |           2 |
>>>>>0e:00:a9:fe:02:d3 | | | | Ip4
>>>>>NULL          |        202 | Static | Reserved     | Start    |
>>>>>ControlNetworkGuru | 852e0a65-c72a-448f-ac71-2bb3549a5a41 |         0
>>>>>2014-03-17 22:21:38 | NULL          | NULL        |           0 |
>>>>>ConsoleProxy       | 2013-09-06 12:44:42 | NULL                | NULL
>>>>>   | NULL     |
>>>>>| 10 | 4c4e6368-95d7-419a-a9b3-a5bb394197f0 |           4 | NULL
>>>>>    | NULL          | NULL        | NULL        | NULL    | NULL
>>>>>|        202 | Static | Deallocating | Start    | ControlNetworkGuru |
>>>>>c28e8ddc-c106-462e-96c8-5d5216dad9b7 |         1 | 2014-03-17
>>>>>12:27:58 |
>>>>>NULL          | NULL        |           0 | DomainRouter       |
>>>>>2013-09-10 08:08:39 | 2014-03-17 11:27:58 | NULL        | NULL     |
>>>>>| 15 | 1f2e99c0-9cd9-47aa-ab10-f190efd7a2dc |           7 | NULL
>>>>>    | NULL          | NULL        | NULL        | NULL    | NULL
>>>>>|        202 | Static | Deallocating | Start    | ControlNetworkGuru |
>>>>>ca1aa99e-e630-4533-9642-523d8a8b1fea |         1 | 2014-03-17
>>>>>12:27:52 |
>>>>>NULL          | NULL        |           0 | DomainRouter       |
>>>>>2013-09-12 10:58:03 | 2014-03-17 11:27:52 | NULL        | NULL     |
>>>>>| 27 | 1c98c4f2-f604-4a38-a813-f68833b1d250 |          18 | NULL
>>>>>    | NULL          | NULL        | NULL        | NULL    | NULL
>>>>>|        202 | Static | Deallocating | Start    | ControlNetworkGuru |
>>>>>ad8e0e50-72aa-4c68-8634-8dc89f12fe01 |         1 | 2014-03-18
>>>>>09:11:16 |
>>>>>NULL          | NULL        |           0 | DomainRouter       |
>>>>>2014-03-17 11:28:50 | 2014-03-18 08:11:16 | NULL        | NULL     |
>>>>>| 30 | cabd4cd9-c39f-423f-ad6a-ee3affe0bd9d |          19 | NULL
>>>>>    | NULL          | NULL        | NULL        | NULL    | NULL
>>>>>|        202 | Static | Allocated    | Start    | ControlNetworkGuru |
>>>>>e81ba56d-a101-4c60-b44f-a0890d56aad9 |         1 | 2014-03-18
>>>>>09:11:44 |
>>>>>NULL          | NULL        |           0 | DomainRouter       |
>>>>>2014-03-18 08:11:32 | NULL                | NULL        | NULL     |
