I wanted to follow up on this. I've now been able to reproduce it
multiple times (and work around the issue myself).

First the resolution: setting the "removed" timestamp in the nics table for
the placeholder entry allows me to delete the network. My suspicion, without
having looked at the code, is that the check for entries to clean up filters,
at least in part, on rows without a "removed" timestamp. I haven't verified
this in the code.
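
A minimal sketch of that change, for anyone hitting the same thing. It
assumes the column is literally named "removed" (as it is in the networks
table) and uses nic id 54, the PlaceHolder row for network 207 in the quoted
output below; the ids will differ on another install, and the theory behind
it is unverified.

```sql
-- Inspect the placeholder nic first. id 54 / network 207 come from the
-- quoted mysql output; substitute your own values.
SELECT id, network_id, ip4_address, state, strategy, removed
  FROM nics
 WHERE network_id = 207
   AND strategy = 'PlaceHolder';

-- Assumption: marking the placeholder row as removed is what lets the
-- network delete proceed. Only touches the row if it is not already removed.
UPDATE nics
   SET removed = NOW()
 WHERE id = 54
   AND strategy = 'PlaceHolder'
   AND removed IS NULL;
```

After the update, retrying the network delete through the API or UI is what
succeeded here.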

As for reproducing it: when I create a shared network, the vrouter isn't
started until I either launch an instance into it or "restart network".
That's sane behaviour, and I recall it being tunable. Regardless, I'm seeing
weird vrouter provisioning issues where XenServer apparently isn't
getting the message to start the instance. Eventually the vrouter is left
in a stopped state. If I start it manually, it starts immediately. This MAY
be something I need to troubleshoot more on my side, but I want to put it
aside for a moment.

Once I start the vrouter (and I can launch instances into that shared
network fine, and they get out), I still cannot delete the network,
regardless of whether there are any instances in it or whether the vrouter
has been destroyed. If there are no instances and I attempt to delete the
network, the vrouter is terminated fine, but the failure in that last step I
documented previously still happens until I make the aforementioned change
in the nics table.


On Tue, Jan 7, 2014 at 12:22 PM, John Vincent <cloudstack-us...@lusis.org> wrote:

> I brought this issue up on IRC but figured it was a good idea to bring it
> up here as well. I'm running into an issue deleting a shared network. This
> is on CS 4.2.0. I have two shared networks exhibiting the same behavior. At
> this point I'd like to clean them up in the database if possible, but if
> something is off with the nics table (as I suspect), I'd be willing to fix
> that as well.
>
> From the logs it appears that the NPE is in the cleanup step:
>
> 2014-01-07 04:48:15,111 DEBUG [network.lb.LoadBalancingRulesManagerImpl] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) Found 0 lb rules to cleanup
> 2014-01-07 04:48:15,111 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) Cleaning up remote access vpns as a part of public IP id=4 release...
> 2014-01-07 04:48:15,122 DEBUG [network.vpn.RemoteAccessVpnManagerImpl] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) there are no Remote access vpns for public ip address id=4
> 2014-01-07 04:48:15,132 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) Sending destroy to com.cloud.network.element.VirtualRouterElement_EnhancerByCloudStack_c958fdcb@31e924f9
> 2014-01-07 04:48:15,133 DEBUG [cloud.network.NetworkManagerImpl] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) Network id=207 is destroyed successfully, cleaning up corresponding resources now.
> 2014-01-07 04:48:15,138 DEBUG [network.guru.DirectNetworkGuru] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) Releasing ip 172.16.0.1 of placeholder nic Nic[54-null-null-172.16.0.1]
> 2014-01-07 04:48:15,140 DEBUG [db.Transaction.Transaction] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) Rolling back the transaction: Time = 5 Name =  -AsyncJobManagerImpl$1.run:494-Executors$RunnableAdapter.call:471-FutureTask$Sync.innerRun:334-FutureTask.run:166-ThreadPoolExecutor.runWorker:1145-ThreadPoolExecutor$Worker.run:615-Thread.run:724; called by -Transaction.rollback:898-Transaction.removeUpTo:841-Transaction.close:665-TransactionContextBuilder.interceptException:63-ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept:133-NetworkManagerImpl.destroyNetwork:3131-ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept:125-NetworkServiceImpl.deleteNetwork:1767-ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept:125-DeleteNetworkCmd.execute:70-ApiDispatcher.dispatch:158-AsyncJobManagerImpl$1.run:531
> 2014-01-07 04:48:15,146 ERROR [cloud.async.AsyncJobManagerImpl] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) Unexpected exception while executing org.apache.cloudstack.api.command.user.network.DeleteNetworkCmd
> java.lang.NullPointerException
>         at com.cloud.network.guru.DirectNetworkGuru.trash(DirectNetworkGuru.java:311)
>         at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>         at com.cloud.network.NetworkManagerImpl.destroyNetwork(NetworkManagerImpl.java:3131)
>         at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>         at com.cloud.network.NetworkServiceImpl.deleteNetwork(NetworkServiceImpl.java:1767)
>         at com.cloud.utils.component.ComponentInstantiationPostProcessor$InterceptorDispatcher.intercept(ComponentInstantiationPostProcessor.java:125)
>         at org.apache.cloudstack.api.command.user.network.DeleteNetworkCmd.execute(DeleteNetworkCmd.java:70)
>         at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:158)
>         at com.cloud.async.AsyncJobManagerImpl$1.run(AsyncJobManagerImpl.java:531)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
> 2014-01-07 04:48:15,149 DEBUG [cloud.async.AsyncJobManagerImpl] (Job-Executor-26:job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ]) Complete async job-320 = [ ceb8c1fb-cb28-42b6-b9f6-6d82f760a689 ], jobStatus: 2, resultCode: 530, result: Error Code: 530 Error text: null
>
>
> I've tried this with and without the virtual router running for that network. 
> Here are the entries from the nic table for that network right now (with the 
> VR running):
>
>
> mysql> select id,instance_id,reservation_id,ip4_address,state,strategy from nics where network_id = '208';
> +-----+-------------+----------------+--------------+--------------+-------------+
> | id  | instance_id | reservation_id | ip4_address  | state        | strategy    |
> +-----+-------------+----------------+--------------+--------------+-------------+
> |  58 |          26 | NULL           | 172.16.1.79  | Deallocating | Create      |
> |  60 |        NULL | NULL           | 172.16.1.1   | Reserved     | PlaceHolder |
> |  61 |          27 | NULL           | 172.16.1.1   | Deallocating | Create      |
> |  64 |          28 | NULL           | 172.16.1.45  | Deallocating | Create      |
> |  73 |          31 | NULL           | 172.16.1.141 | Deallocating | Create      |
> |  77 |          33 | NULL           | 172.16.1.1   | Deallocating | Create      |
> | 147 |          85 | NULL           | 172.16.1.1   | Deallocating | Create      |
> | 150 |          86 | NULL           | 172.16.1.1   | Reserved     | Create      |
> +-----+-------------+----------------+--------------+--------------+-------------+
> 8 rows in set (0.00 sec)
>
>
> Here's the entry from the networks table for it:
>
>
> mysql> select * from networks where id = 208\G
> *************************** 1. row ***************************
>                    id: 208
>                  name: prod-be-network
>                  uuid: be81f804-f75d-4c8f-89af-9800b3c3f328
>          display_text: Production Backend Network
>          traffic_type: Guest
> broadcast_domain_type: Vlan
>         broadcast_uri: vlan://1121
>               gateway: 172.16.1.1
>                  cidr: 172.16.1.0/24
>                  mode: Dhcp
>   network_offering_id: 19
>   physical_network_id: 202
>        data_center_id: 1
>             guru_name: DirectNetworkGuru
>                 state: Implementing
>               related: 208
>             domain_id: 1
>            account_id: 1
>                  dns1: 8.8.8.8
>                  dns2: NULL
>             guru_data: NULL
>            set_fields: 0
>              acl_type: Domain
>        network_domain: cs1cloud.internal
>        reservation_id: eed95258-e478-49ed-a5ef-ffa5460d320d
>            guest_type: Shared
>      restart_required: 0
>               created: 2013-11-20 16:00:26
>               removed: NULL
>     specify_ip_ranges: 1
>                vpc_id: NULL
>           ip6_gateway: NULL
>              ip6_cidr: NULL
>          network_cidr: NULL
>       display_network: 1
>        network_acl_id: NULL
> 1 row in set (0.00 sec)
>
>
> For comparison, here are the entries from nics and networks for the other
> network that won't delete (207):
>
>
> nics:
>
> mysql> select id,instance_id,reservation_id,ip4_address,state,strategy from nics where network_id = '207';
> +-----+-------------+----------------+--------------+--------------+-------------+
> | id  | instance_id | reservation_id | ip4_address  | state        | strategy    |
> +-----+-------------+----------------+--------------+--------------+-------------+
> |  53 |          24 | NULL           | 172.16.0.23  | Deallocating | Create      |
> |  54 |        NULL | NULL           | 172.16.0.1   | Reserved     | PlaceHolder |
> |  55 |          25 | NULL           | 172.16.0.1   | Deallocating | Create      |
> |  59 |          26 | NULL           | 172.16.0.80  | Deallocating | Create      |
> |  72 |          31 | NULL           | 172.16.0.164 | Deallocating | Create      |
> |  74 |          32 | NULL           | 172.16.0.1   | Deallocating | Create      |
> |  80 |          34 | NULL           | 172.16.0.36  | Deallocating | Create      |
> |  81 |          35 | NULL           | 172.16.0.32  | Deallocating | Create      |
> |  85 |          38 | NULL           | 172.16.0.168 | Deallocating | Create      |
> | 144 |          84 | NULL           | 172.16.0.1   | Deallocating | Create      |
> +-----+-------------+----------------+--------------+--------------+-------------+
> 10 rows in set (0.00 sec)
>
>
> networks:
>
> mysql> select * from networks where id = 207\G
> *************************** 1. row ***************************
>                    id: 207
>                  name: prod-fe-network
>                  uuid: da81b864-7082-4c46-a425-c5c6265a1eca
>          display_text: Production frontend network
>          traffic_type: Guest
> broadcast_domain_type: Vlan
>         broadcast_uri: vlan://1120
>               gateway: 172.16.0.1
>                  cidr: 172.16.0.0/24
>                  mode: Dhcp
>   network_offering_id: 19
>   physical_network_id: 202
>        data_center_id: 1
>             guru_name: DirectNetworkGuru
>                 state: Setup
>               related: 207
>             domain_id: 1
>            account_id: 1
>                  dns1: 8.8.8.8
>                  dns2: NULL
>             guru_data: NULL
>            set_fields: 0
>              acl_type: Domain
>        network_domain: cs1cloud.internal
>        reservation_id: 81d652a1-9cb3-4833-92ca-58ee7b33c04b
>            guest_type: Shared
>      restart_required: 0
>               created: 2013-11-20 15:54:39
>               removed: NULL
>     specify_ip_ranges: 1
>                vpc_id: NULL
>           ip6_gateway: NULL
>              ip6_cidr: NULL
>          network_cidr: NULL
>       display_network: 1
>        network_acl_id: NULL
> 1 row in set (0.00 sec)
>
>
