[Yahoo-eng-team] [Bug 1605046] [NEW] Router delete causes L3 Agent hang

2016-07-20 Thread James Dempsey
Public bug reported:

* High-level description

Occasionally, deleting a router will cause the VPN Agent to get stuck in
a loop trying to tear down a deleted router.

* Pre-conditions

The fault appears to be random; the only pre-condition is deleting a router.

* Reproduction steps

Delete a router (e.g. 'neutron router-delete <router-id>').  There is a chance that the VPN agent will get stuck.

* Expected output

The VPN agent removes the router.

* Actual output

The VPN agent spins in a tight loop, repeatedly executing:

ip netns exec qrouter-e14edfa6-a3e1-4866-8a1a-ee6ecf0f4a67 find /sys/class/net -maxdepth 1 -type l -printf %f

The command fails because the namespace no longer exists, and the agent immediately retries it.

Neutron stack traces accumulate and fill up disks *very* quickly.
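
One possible mitigation, sketched below, would be for the agent to check whether the qrouter namespace still exists before retrying teardown, and to treat a missing namespace as a completed delete. This is only an illustration of the idea, not the actual Neutron code; namespace_exists() and safe_router_removed() here are hypothetical helpers:

    import subprocess

    def namespace_exists(name):
        """Return True if the named network namespace still exists."""
        out = subprocess.check_output(['ip', 'netns', 'list'])
        # `ip netns list` prints one namespace per line; newer iproute2
        # appends "(id: N)", so compare only the first token.
        return any(line.split()[0] == name
                   for line in out.decode().splitlines() if line.strip())

    def safe_router_removed(router_id, teardown):
        ns = 'qrouter-' + router_id
        if not namespace_exists(ns):
            # Namespace already gone: report success instead of retrying
            # the failing `ip netns exec` every 100 ms forever.
            return True
        teardown()
        return True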


* Version
  Neutron VPN Agent 7.0.4.3
  Ubuntu 14.04.3
  4.2 kernel

* Perceived severity
  Active production issue; wakes Ops staff from time to time.

* Logs are attached; the stack trace below repeats roughly every 100 ms.

2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent [-] Error while deleting router 69e961ca-6b64-4085-833a-7796b2fce233
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 346, in _safe_router_removed
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self._router_removed(router_id)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 364, in _router_removed
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     ri.delete(self)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 270, in delete
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self.process_delete(agent)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/common/utils.py", line 359, in call
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self.logger(e)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 221, in __exit__
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self.force_reraise()
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/oslo_utils/excutils.py", line 197, in force_reraise
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/common/utils.py", line 356, in call
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     return func(*args, **kwargs)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 727, in process_delete
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     self._process_internal_ports(agent.pd)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 444, in _process_internal_ports
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     existing_devices = self._get_existing_devices()
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/l3/router_info.py", line 336, in _get_existing_devices
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     ip_devs = ip_wrapper.get_devices(exclude_loopback=True)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 125, in get_devices
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     log_fail_as_error=self.log_fail_as_error
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent   File "/opt/cat/openstack/neutron/local/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 159, in execute
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent     raise RuntimeError(m)
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent RuntimeError:
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-69e961ca-6b64-4085-833a-7796b2fce233', 'find', '/sys/class/net', '-maxdepth', '1', '-type', 'l', '-printf', '%f ']
2016-07-20 14:03:41.243 6152 ERROR neutron.agent.l3.agent Exit code: 1

[Yahoo-eng-team] [Bug 1592167] [NEW] Deleted keypair causes metadata failure

2016-06-13 Thread James Dempsey
Public bug reported:

Description
===========

If a user deletes a keypair that was used to create an instance, that
instance receives HTTP 400 errors when attempting to get metadata via
http://169.254.169.254/openstack/latest/meta_data.json.

This causes problems in the instance when cloud-init fails to retrieve
the OpenStack datasource.
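
A possible hardening on the Nova side, sketched from the traceback below, would be to tolerate a missing keypair when building meta_data.json. This is not the actual Nova fix; lookup_keypair() is a hypothetical helper, though KeyPair.get_by_name() and KeypairNotFound are the real names from the traceback:

    from nova import exception
    from nova.objects import keypair as keypair_obj

    def lookup_keypair(context, user_id, key_name):
        """Return the instance's keypair, or None if it was deleted."""
        try:
            return keypair_obj.KeyPair.get_by_name(context, user_id, key_name)
        except exception.KeypairNotFound:
            # A deleted keypair should just omit the public key from the
            # metadata rather than turn the request into an HTTP 400.
            return None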

Steps to reproduce
==================

1. Create an instance with an SSH keypair defined.
2. Delete the SSH keypair (e.g. 'nova keypair-delete <name>').
3. From the instance, run 'curl http://169.254.169.254/openstack/latest/meta_data.json'.

Expected result
===============

Instance receives metadata from
http://169.254.169.254/openstack/latest/meta_data.json

Actual result
=============

The instance receives an HTTP 400 error.  Additionally, Ubuntu Cloud Image
instances fall back to the EC2 datasource and re-generate host SSH keys.

Environment
===========

Nova:       2015.1.4.2
Hypervisor: Libvirt + KVM
Storage:    Ceph
Network:    Liberty Neutron ML2+OVS


Logs
====

[req-a8385839-6993-4289-96dc-1714afe82597 - - - - -] FaultWrapper error
Traceback (most recent call last):
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/api/ec2/__init__.py", line 93, in __call__
    return req.get_response(self.application)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/webob/request.py", line 1299, in send
    application, catch_exc_info=False)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/webob/request.py", line 1263, in call_application
    app_iter = application(self.environ, start_response)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/webob/dec.py", line 130, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/webob/dec.py", line 195, in call_func
    return self.func(req, *args, **kwargs)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/api/ec2/__init__.py", line 105, in __call__
    rv = req.get_response(self.application)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/webob/request.py", line 1299, in send
    application, catch_exc_info=False)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/webob/request.py", line 1263, in call_application
    app_iter = application(self.environ, start_response)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/webob/dec.py", line 130, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/webob/dec.py", line 195, in call_func
    return self.func(req, *args, **kwargs)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/api/metadata/handler.py", line 137, in __call__
    data = meta_data.lookup(req.path_info)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/api/metadata/base.py", line 418, in lookup
    data = self.get_openstack_item(path_tokens[1:])
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/api/metadata/base.py", line 297, in get_openstack_item
    return self._route_configuration().handle_path(path_tokens)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/api/metadata/base.py", line 491, in handle_path
    return path_handler(version, path)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/api/metadata/base.py", line 316, in _metadata_as_json
    self.instance.key_name)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/objects/base.py", line 163, in wrapper
    result = fn(cls, context, *args, **kwargs)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/objects/keypair.py", line 60, in get_by_name
    db_keypair = db.key_pair_get(context, user_id, name)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/db/api.py", line 937, in key_pair_get
    return IMPL.key_pair_get(context, user_id, name)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 233, in wrapper
    return f(*args, **kwargs)
  File "/opt/cat/openstack/nova/local/lib/python2.7/site-packages/nova/db/sqlalchemy/api.py", line 2719, in key_pair_get
    raise exception.KeypairNotFound(user_id=user_id, name=name)
KeypairNotFound: Keypair keypair_name not found for user 


** Affects: nova
 Importance: Undecided
 Status: New


[Yahoo-eng-team] [Bug 1483480] [NEW] RFE - Allow annotations on Neutron resources

2015-08-10 Thread James Dempsey
Public bug reported:

Managing security groups and rules is very difficult without the ability
to annotate them.  An optional annotation field on Neutron resources
would make the cloud much more usable.
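
As a purely hypothetical illustration of the request (the "annotation" field does not exist in the Neutron API today; the endpoint, token, and IDs below are placeholders), creating a security group rule with an annotation might look like:

    import json
    import urllib2  # Python 2, matching the deployments in this digest

    # "annotation" is the requested field -- it does not exist in the
    # Neutron API today; all other values are placeholders.
    body = json.dumps({'security_group_rule': {
        'security_group_id': '85cc3048-abc3-43cc-89b3-377341426ac5',
        'direction': 'ingress',
        'protocol': 'tcp',
        'port_range_min': 22,
        'port_range_max': 22,
        'annotation': 'SSH from bastion only; see change OPS-1234',
    }})
    req = urllib2.Request(
        'http://neutron.example.com:9696/v2.0/security-group-rules',
        body, {'Content-Type': 'application/json', 'X-Auth-Token': '<token>'})
    urllib2.urlopen(req)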

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: rfe



[Yahoo-eng-team] [Bug 1430003] [NEW] Corrupt POSTROUTING Chain when using Metering and VPN agents together

2015-03-09 Thread James Dempsey
Public bug reported:

I'm using the Icehouse UCA version 1:2014.1.3-0ubuntu1~cloud0 of the
VPN (openswan driver) and Metering (iptables driver) agents on Ubuntu
Precise.

The ordering of the POSTROUTING chain in the NAT table inside router
namespaces seems to be broken.  In many of my routers, the
neutron-postrouting-bottom rule is listed before the
neutron-vpn-agen-POSTROUTING rule.  This causes traffic that should have
traversed a VPN to be source-NAT'd as if it were traffic leaving via the
default route.  Rescheduling the router or removing its metering rules
seems to cause a re-ordering of the rules.

It seems to me that neutron-postrouting-bottom should always be the last
rule in the POSTROUTING chain.  Is this correct?
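
For reference, a small diagnostic sketch (not part of Neutron; the router ID below is a placeholder) that checks whether neutron-postrouting-bottom is last in a router's POSTROUTING chain:

    import subprocess

    def postrouting_targets(ns):
        """List the target chains of the nat POSTROUTING chain in a namespace."""
        out = subprocess.check_output(
            ['ip', 'netns', 'exec', ns,
             'iptables', '-t', 'nat', '-L', 'POSTROUTING', '-n'])
        # The first two lines are the chain header and the column header.
        return [line.split()[0]
                for line in out.decode().splitlines()[2:] if line.strip()]

    # Usage:
    #   targets = postrouting_targets('qrouter-<router-id>')
    #   assert targets[-1] == 'neutron-postrouting-bottom', targets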

In the following state, VPNs are broken regardless of whether Phase 1 and
Phase 2 IPsec security associations exist:

Chain POSTROUTING (policy ACCEPT 2194K packets, 129M bytes)
 pkts bytes target                        prot opt in out source     destination
2199K  129M neutron-meter-POSTROUTING     all  --  *  *   0.0.0.0/0  0.0.0.0/0
2568K  152M neutron-postrouting-bottom    all  --  *  *   0.0.0.0/0  0.0.0.0/0
2563K  151M neutron-vpn-agen-POSTROUTING  all  --  *  *   0.0.0.0/0  0.0.0.0/0

Removing metering rules can cause the above chain to be changed into the
following, which I assume (but have not verified) would break metering,
were it enabled.

Chain POSTROUTING (policy ACCEPT 2199K packets, 129M bytes)
 pkts bytes target                        prot opt in out source     destination
2569K  152M neutron-vpn-agen-POSTROUTING  all  --  *  *   0.0.0.0/0  0.0.0.0/0
2574K  152M neutron-postrouting-bottom    all  --  *  *   0.0.0.0/0  0.0.0.0/0
2204K  130M neutron-meter-POSTROUTING     all  --  *  *   0.0.0.0/0  0.0.0.0/0

** Affects: neutron
 Importance: Undecided
 Status: New


** Tags: l3 l3-agent metering vpnaas
