Public bug reported:

We are running into an unexpected situation where the number of DVR routers on a compute node grows to nearly 2000. Some instances on that node have a NIC on the floating IP network.

We are using the Queens release:

neutron-common/xenial,now 2:12.0.5-5~u16.04+mcp155 all [installed,automatic]
neutron-l3-agent/xenial,now 2:12.0.5-5~u16.04+mcp155 all [installed]
neutron-metadata-agent/xenial,now 2:12.0.5-5~u16.04+mcp155 all [installed,automatic]
neutron-openvswitch-agent/xenial,now 2:12.0.5-5~u16.04+mcp155 all [installed]
python-neutron/xenial,now 2:12.0.5-5~u16.04+mcp155 all [installed,automatic]
python-neutron-fwaas/xenial,xenial,now 2:12.0.1-1.0~u16.04+mcp6 all [installed,automatic]
python-neutron-lib/xenial,xenial,now 1.13.0-1.0~u16.04+mcp9 all [installed,automatic]
python-neutronclient/xenial,xenial,now 1:6.7.0-1.0~u16.04+mcp17 all [installed,automatic]

Currently, my guess is that some application mistakenly invokes RPC calls like this one

https://github.com/openstack/neutron/blob/490471ebd3ac56d0cee164b9c1c1211687e49437/neutron/api/rpc/agentnotifiers/l3_rpc_agent_api.py#L166

for a DVR router associated with a floating IP address, on a host which has a fixed IP address allocated from the floating network (that is, a port whose device_owner is prefixed with compute:). Such a router is then kept by this function

https://github.com/openstack/neutron/blob/490471ebd3ac56d0cee164b9c1c1211687e49437/neutron/db/l3_dvrscheduler_db.py#L427

because `get_subnet_ids_on_router` does not filter out router:gateway ports.

I think this is a bug: as long as a host has no ports with the specific device owners, it should not have a DVR router on it.

Besides, it is pretty easy to reproduce this bug:

First, create a DVR router with an external gateway on the floating network.
Then create a virtual machine with a fixed IP on the floating network.
Then call `routers_updated_on_host` manually. The DVR router will be created on the host where the VM resides, but it should not be there.
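The suspected logic can be illustrated with a small in-memory model. This is only a sketch of the behaviour as described in this report, not Neutron's actual code: the Port class, both helper functions and the skip_gateway flag are hypothetical names introduced for illustration. Only the device_owner strings (network:router_gateway and the compute: prefix) follow Neutron's conventions.

```python
from dataclasses import dataclass

# device_owner values following Neutron's naming conventions.
ROUTER_GW = "network:router_gateway"
COMPUTE_PREFIX = "compute:"


@dataclass(frozen=True)
class Port:
    """Hypothetical, simplified stand-in for a Neutron port."""
    device_id: str
    device_owner: str
    subnet_id: str
    host: str = ""


def subnet_ids_on_router(ports, router_id, skip_gateway):
    """Collect the subnets a router has ports on.

    With skip_gateway=False the external gateway port is counted too,
    which is the behaviour this report suspects.
    """
    subnet_ids = set()
    for port in ports:
        if port.device_id != router_id:
            continue
        if skip_gateway and port.device_owner == ROUTER_GW:
            continue
        subnet_ids.add(port.subnet_id)
    return subnet_ids


def router_needed_on_host(ports, router_id, host, skip_gateway):
    """Return True if a compute port on `host` sits on one of the router's subnets."""
    subnets = subnet_ids_on_router(ports, router_id, skip_gateway)
    return any(
        port.host == host
        and port.device_owner.startswith(COMPUTE_PREFIX)
        and port.subnet_id in subnets
        for port in ports
    )


# Reproduction scenario: the router only has a gateway port on the floating
# network, and the VM has a fixed IP on that same network.
ports = [
    Port("router-1", ROUTER_GW, "floating-subnet"),
    Port("vm-1", "compute:nova", "floating-subnet", host="compute-1"),
]

unfiltered = router_needed_on_host(ports, "router-1", "compute-1", skip_gateway=False)
filtered = router_needed_on_host(ports, "router-1", "compute-1", skip_gateway=True)
print(unfiltered, filtered)
```

In this model the unfiltered check keeps the router on compute-1 purely because of the gateway port's subnet, while the filtered check does not. Whether skipping gateway ports is the right fix in Neutron itself is a separate question that this sketch does not answer.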
** Affects: neutron
     Importance: Undecided
     Assignee: norman shen (jshen28)
         Status: In Progress

** Description changed:

+ Besides, it is pretty easy to reproduce this bug:
+
+ First, create a DVR router with an external gateway on the floating network.
+ Then create a virtual machine with a fixed IP on the floating network.
+ Then call `routers_updated_on_host` manually. The DVR router will be created on the host where the VM resides, but it should not be there.

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1840579

Title:
  excessive number of dvrs where vm got a fixed ip on floating network

Status in neutron:
  In Progress

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1840579/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp