[Yahoo-eng-team] [Bug 1648242] Re: Failure to retry update_ha_routers_states
** Also affects: neutron (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: neutron (Ubuntu Xenial)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1648242

Title: Failure to retry update_ha_routers_states

Status in Ubuntu Cloud Archive: New
Status in neutron: Fix Released
Status in neutron package in Ubuntu: New
Status in neutron source package in Xenial: New

Bug description:

Version: Mitaka

While performing failover testing of L3 HA routers, we discovered an issue where an agent can fail to report its state. In this scenario, we have a router (7629f5d7-b205-4af5-8e0e-a3c4d15e7677) scheduled to three L3 agents:

+--------------------------------------+---------------------------------------------------+----------------+-------+----------+
| id                                   | host                                              | admin_state_up | alive | ha_state |
+--------------------------------------+---------------------------------------------------+----------------+-------+----------+
| 4434f999-51d0-4bbb-843c-5430255d5c64 | 726404-infra03-neutron-agents-container-a8bb0b1f  | True           | :-)   | active   |
| 710e7768-df47-4bfe-917f-ca35c138209a | 726402-infra01-neutron-agents-container-fc937477  | True           | :-)   | standby  |
| 7f0888ba-1e8a-4a36-8394-6448b8c606fb | 726403-infra02-neutron-agents-container-0338af5a  | True           | :-)   | standby  |
+--------------------------------------+---------------------------------------------------+----------------+-------+----------+

The infra03 node was shut down completely and abruptly. The router transitioned to master on infra02, as indicated by these log messages:

2016-12-06 16:15:06.457 18450 INFO neutron.agent.linux.interface [-] Device qg-d48918fa-eb already exists
2016-12-07 15:16:51.145 18450 INFO neutron.agent.l3.ha [-] Router c8b5d5b7-ab57-4f56-9838-0900dc304af6 transitioned to master
2016-12-07 15:16:51.811 18450 INFO eventlet.wsgi.server [-] - - [07/Dec/2016 15:16:51] "GET / HTTP/1.1" 200 115 0.666464
2016-12-07 15:18:29.167 18450 INFO neutron.agent.l3.ha [-] Router c8b5d5b7-ab57-4f56-9838-0900dc304af6 transitioned to backup
2016-12-07 15:18:29.229 18450 INFO eventlet.wsgi.server [-] - - [07/Dec/2016 15:18:29] "GET / HTTP/1.1" 200 115 0.062110
2016-12-07 15:21:48.870 18450 INFO neutron.agent.l3.ha [-] Router 7629f5d7-b205-4af5-8e0e-a3c4d15e7677 transitioned to master
2016-12-07 15:21:49.537 18450 INFO eventlet.wsgi.server [-] - - [07/Dec/2016 15:21:49] "GET / HTTP/1.1" 200 115 0.667920
2016-12-07 15:22:08.796 18450 INFO neutron.agent.l3.ha [-] Router 4676e7a5-279c-4114-8674-209f7fd5ab1a transitioned to master
2016-12-07 15:22:09.515 18450 INFO eventlet.wsgi.server [-] - - [07/Dec/2016 15:22:09] "GET / HTTP/1.1" 200 115 0.719848

Traffic to/from VMs through the new master router functioned as expected. However, the ha_state remained 'standby' on every agent:

+--------------------------------------+---------------------------------------------------+----------------+-------+----------+
| id                                   | host                                              | admin_state_up | alive | ha_state |
+--------------------------------------+---------------------------------------------------+----------------+-------+----------+
| 4434f999-51d0-4bbb-843c-5430255d5c64 | 726404-infra03-neutron-agents-container-a8bb0b1f  | True           | xxx   | standby  |
| 710e7768-df47-4bfe-917f-ca35c138209a | 726402-infra01-neutron-agents-container-fc937477  | True           | :-)   | standby  |
| 7f0888ba-1e8a-4a36-8394-6448b8c606fb | 726403-infra02-neutron-agents-container-0338af5a  | True           | :-)   | standby  |
+--------------------------------------+---------------------------------------------------+----------------+-------+----------+

A traceback related to a message timeout was observed in the logs, probably caused by the loss of the AMQP server on infra03:

2016-12-07 15:22:30.525 18450 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 172.29.237.155:5671 is unreachable: timed out. Trying again in 1 seconds.
2016-12-07 15:22:36.537 18450 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 172.29.237.155:5671 is unreachable: timed out. Trying again in 1 seconds.
2016-12-07 15:22:37.553 18450 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnected to AMQP server on 172.29.238.65:5671 via [amqp] client
2016-12-07 15:22:51.210 18450 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 172.29.237.246:5671 is unreachable: Basic.cancel: (0) 1. Trying again in 1 seconds.
2016-12-07 15:22:52.262 18450 INFO oslo.messaging._drivers.impl_rabbit [-] Reconnected to AMQP server on 172.29.237.246:5671 via [amqp] client
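The fix released in neutron makes the agent retry the update_ha_routers_states report instead of dropping it when the RPC call times out, which is what left every agent stuck in 'standby' above. Below is a minimal sketch of that retry pattern using oslo.messaging; the function and parameter names are illustrative, not the actual neutron code.

    import time

    import oslo_messaging


    def report_ha_states(rpc_client, context, host, states,
                         retries=3, delay=5):
        """Push HA router states to the server, retrying on RPC timeouts.

        Illustrative sketch of the retry pattern only; the real fix
        lives in neutron's L3 HA agent and uses its own RPC plumbing.
        """
        for attempt in range(1, retries + 1):
            try:
                cctxt = rpc_client.prepare()
                cctxt.call(context, 'update_ha_routers_states',
                           host=host, states=states)
                return
            except oslo_messaging.MessagingTimeout:
                if attempt == retries:
                    raise  # out of retries; surface the failure
                time.sleep(delay)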
[Yahoo-eng-team] [Bug 1602057] Re: (libvirt) KeyError updating resources for some node, guest.uuid is not in BDM list
** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: nova (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: Ubuntu Xenial
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1602057

Title: (libvirt) KeyError updating resources for some node, guest.uuid is not in BDM list

Status in Ubuntu Cloud Archive: Fix Released
Status in Ubuntu Cloud Archive mitaka series: Triaged
Status in Ubuntu Cloud Archive newton series: Fix Released
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) mitaka series: Won't Fix
Status in OpenStack Compute (nova) newton series: Fix Committed
Status in nova package in Ubuntu: New
Status in nova source package in Xenial: New

Bug description:

2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager [req-d5d5d486-b488-4429-bbb5-24c9f19ff2c0 - - - - -] Error updating resources for node controller.
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager Traceback (most recent call last):
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6726, in update_available_resource
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager     rt.update_available_resource(context)
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 500, in update_available_resource
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager     resources = self.driver.get_available_resource(self.nodename)
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5728, in get_available_resource
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager     disk_over_committed = self._get_disk_over_committed_size_total()
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 7397, in _get_disk_over_committed_size_total
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager     local_instances[guest.uuid], bdms[guest.uuid])
2016-07-12 09:54:36.021 10056 ERROR nova.compute.manager KeyError: '0a5c5743-9555-4dfd-b26e-198449ebeee5'
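The KeyError is an unguarded dict lookup: a domain known to libvirt may have no matching entry in the instance or BDM mappings (for example an instance mid-deletion, or one the database attributes to another host). A minimal defensive sketch of the loop's shape; disk_size_for() is a hypothetical stand-in for nova's per-instance calculation, so this is the idea of the fix, not the merged patch.

    def disk_over_committed_total(guests, local_instances, bdms):
        """Sum over-committed disk sizes, skipping guests with no BDM entry.

        Sketch only: mirrors the shape of nova's
        _get_disk_over_committed_size_total().
        """
        total = 0
        for guest in guests:
            if guest.uuid not in local_instances or guest.uuid not in bdms:
                # Known to libvirt but not to this host's DB view (e.g.
                # mid-deletion): skip it rather than raise KeyError and
                # abort the periodic resource update for the whole node.
                continue
            total += disk_size_for(local_instances[guest.uuid],
                                   bdms[guest.uuid])
        return total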
[Yahoo-eng-team] [Bug 1666827] Re: Backport fixes for Rename Network return 403 Error
** Also affects: horizon (Ubuntu Yakkety)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Trusty)
   Importance: Undecided
   Status: New

** Changed in: horizon (Ubuntu Yakkety)
   Status: New => Fix Released

** Changed in: horizon (Ubuntu Trusty)
   Status: New => Triaged

** Changed in: horizon (Ubuntu Xenial)
   Status: New => Triaged

https://bugs.launchpad.net/bugs/1666827

Title: Backport fixes for Rename Network return 403 Error

Status in Ubuntu Cloud Archive: New
Status in OpenStack Dashboard (Horizon): New
Status in horizon package in Ubuntu: New
Status in horizon source package in Trusty: Triaged
Status in horizon source package in Xenial: Triaged
Status in horizon source package in Yakkety: Fix Released

Bug description:

[Impact]

Non-admin users are not allowed to change the name of a network using the OpenStack Dashboard GUI.

[Test Case]

1. Deploy a trusty-mitaka or xenial-mitaka OpenStack cloud
2. Create a demo project
3. Create a demo user
4. Log into the OpenStack Dashboard as the demo user
5. Go to Project -> Network and create a network
6. Go to Project -> Network and edit the just-created network
7. Change the name and click Save
8. Observe that the request is denied with an error message

[Regression Potential]

Minimal. We are adding a patch, already merged into upstream stable/mitaka, for the horizon call to policy_check before sending the request to Neutron when updating networks. The addition of the rule "update_network:shared" to horizon's copy of Neutron's policy.json is our own, since upstream was not willing to backport this required change. This rule is not referenced anywhere else in the code base, so it will not affect other policy_check calls.

Upstream bug: https://bugs.launchpad.net/horizon/+bug/1609467
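For reference, the dashboard-side workaround amounts to a one-line addition to horizon's bundled copy of Neutron's policy.json. A sketch of applying it with a small script; both the file path and the rule value below are assumptions for illustration, so consult the actual backported patch for the authoritative rule.

    import json

    # Path is an assumption; distributions ship horizon's copy of the
    # Neutron policy file in different locations.
    POLICY_FILE = '/etc/openstack-dashboard/neutron_policy.json'

    with open(POLICY_FILE) as f:
        policy = json.load(f)

    # Hypothetical rule value: restrict updating the 'shared' attribute
    # to admins so the non-admin rename check no longer falls through to
    # a blanket deny. The backport defines the exact rule.
    policy.setdefault('update_network:shared', 'rule:admin_only')

    with open(POLICY_FILE, 'w') as f:
        json.dump(policy, f, indent=4, sort_keys=True)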
[Yahoo-eng-team] [Bug 1298061] Re: nova should allow evacuate for an instance in the Error state
** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: nova (Ubuntu Trusty)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1298061

Title: nova should allow evacuate for an instance in the Error state

Status in OpenStack Compute (nova): Fix Released
Status in nova package in Ubuntu: New
Status in nova source package in Trusty: In Progress

Bug description:

[Impact]

* Instances in the Error state cannot be evacuated.

[Test Case]

* nova evacuate
* nova refuses to evacuate the instance because of its state

[Regression Potential]

* None

We currently allow reboot/rebuild/rescue for an instance in the Error state if the instance has successfully booted at least once. We should allow "evacuate" as well, since it is essentially a "rebuild" on a different compute node. This would be useful in a number of cases, in particular if an initial evacuation attempt fails (putting the instance into the Error state).
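nova gates instance actions on vm_state with a decorator in nova/compute/api.py; conceptually, the change is to include the error state in the set that evacuate accepts, just as rebuild already does. A standalone sketch of that gate with simplified names (the real decorator is nova's check_instance_state, and the state constants live in nova.compute.vm_states):

    import functools


    class InstanceInvalidState(Exception):
        pass


    def check_instance_state(allowed):
        """Reject an API action unless the instance's vm_state allows it.

        Standalone sketch of the gate nova applies in nova/compute/api.py.
        """
        def decorator(fn):
            @functools.wraps(fn)
            def inner(self, context, instance, *args, **kwargs):
                if instance['vm_state'] not in allowed:
                    raise InstanceInvalidState(
                        '%s not allowed while instance is in vm_state %s'
                        % (fn.__name__, instance['vm_state']))
                return fn(self, context, instance, *args, **kwargs)
            return inner
        return decorator


    # The change amounts to adding 'error' to evacuate's allowed states,
    # matching what reboot/rebuild/rescue already permit:
    #
    #     @check_instance_state(allowed=('active', 'stopped', 'error'))
    #     def evacuate(self, context, instance, host, ...):
    #         ...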
[Yahoo-eng-team] [Bug 1570748] Re: Bug: resize instance after edit flavor with horizon
** Also affects: nova (Ubuntu Yakkety)
   Importance: Undecided
   Status: New

** Changed in: nova (Ubuntu Yakkety)
   Status: New => Fix Released

https://bugs.launchpad.net/bugs/1570748

Title: Bug: resize instance after edit flavor with horizon

Status in Ubuntu Cloud Archive: New
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) kilo series: Fix Released
Status in OpenStack Compute (nova) liberty series: Fix Committed
Status in OpenStack Compute (nova) mitaka series: Fix Committed
Status in nova-powervm: Fix Committed
Status in tempest: Fix Released
Status in nova package in Ubuntu: Fix Released
Status in nova source package in Wily: New
Status in nova source package in Xenial: New
Status in nova source package in Yakkety: Fix Released

Bug description:

An error occurs when resizing an instance after editing its flavor with horizon (and also after deleting the flavor used by the instance).

Steps to reproduce:

1. Create flavor A
2. Boot an instance using flavor A
3. Edit the flavor with horizon (or delete flavor A) -- the result is the same whether you edit or delete, because editing a flavor means deleting and recreating it
4. Resize or migrate the instance
5. An error occurs

Log (nova-compute.log):

  File "/opt/openstack/src/nova/nova/conductor/manager.py", line 422, in _object_dispatch
    return getattr(target, method)(*args, **kwargs)
  File "/opt/openstack/src/nova/nova/objects/base.py", line 163, in wrapper
    result = fn(cls, context, *args, **kwargs)
  File "/opt/openstack/src/nova/nova/objects/flavor.py", line 132, in get_by_id
    db_flavor = db.flavor_get(context, id)
  File "/opt/openstack/src/nova/nova/db/api.py", line 1479, in flavor_get
    return IMPL.flavor_get(context, id)
  File "/opt/openstack/src/nova/nova/db/sqlalchemy/api.py", line 233, in wrapper
    return f(*args, **kwargs)
  File "/opt/openstack/src/nova/nova/db/sqlalchemy/api.py", line 4732, in flavor_get
    raise exception.FlavorNotFound(flavor_id=id)
FlavorNotFound: Flavor 7 could not be found.

This error occurs because of the following code in /opt/openstack/src/nova/nova/compute/manager.py:

    def resize_instance(self, context, instance, image, reservations,
                        migration, instance_type, clean_shutdown=True):
        if (not instance_type or
                not isinstance(instance_type, objects.Flavor)):
            instance_type = objects.Flavor.get_by_id(
                context, migration['new_instance_type_id'])

I think the deleted flavor should still be usable when resizing an instance. I tested this in stable/kilo, but I believe stable/liberty and stable/mitaka have the same bug because the source code is unchanged.

Thanks.
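Flavors are soft-deleted in nova's database, so the row the traceback fails to find still exists; a context with read_deleted="yes" can reach it. A sketch of that fallback, assuming nova's utils.temporary_mutation helper (this illustrates one way to tolerate the deletion, not necessarily the merged fix):

    from nova import exception
    from nova import objects
    from nova import utils


    def get_flavor_even_if_deleted(context, flavor_id):
        """Fetch a flavor for an in-flight resize, tolerating soft-deletion.

        Sketch: flavors are soft-deleted, so retrying the lookup with
        read_deleted='yes' can find the row that a plain
        Flavor.get_by_id() no longer sees.
        """
        try:
            return objects.Flavor.get_by_id(context, flavor_id)
        except exception.FlavorNotFound:
            with utils.temporary_mutation(context, read_deleted='yes'):
                return objects.Flavor.get_by_id(context, flavor_id)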
[Yahoo-eng-team] [Bug 1570748] Re: Bug: resize instance after edit flavor with horizon
** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: nova (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: nova (Ubuntu Wily)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1570748

Title: Bug: resize instance after edit flavor with horizon
[Yahoo-eng-team] [Bug 1553815] Re: host keys never restored following metadata api outage
** Also affects: cloud-init (Ubuntu Trusty)
   Importance: Undecided
   Status: New

** Also affects: cloud-init (Ubuntu Wily)
   Importance: Undecided
   Status: New

** Also affects: cloud-init (Ubuntu Xenial)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1553815

Title: host keys never restored following metadata api outage

Status in cloud-init: Fix Committed
Status in cloud-init package in Ubuntu: New
Status in cloud-init source package in Trusty: New
Status in cloud-init source package in Wily: New
Status in cloud-init source package in Xenial: Fix Released

Bug description:

We are running an OpenStack cloud and have noticed some unexpected behaviour in our Ubuntu Trusty cloud instances created by Nova. We have observed that if a previously initialised instance (i.e. one where DataSourceOpenstack has already run) is rebooted while the metadata API is unavailable (i.e. 169.254.169.254 is unreachable), cloud-init will retry a few times, then switch to DataSourceNone and regenerate the host keys.

# Boot instance under normal conditions
ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
/var/lib/cloud/instances/cd535bc4-9c2f-4d31-8903-0ede59c7ef95
ubuntu@vm1:~$ grep "Generating public/private rsa key pair." /var/log/cloud-init-output.log
Generating public/private rsa key pair.

# Stop neutron metadata api service and reboot instance (observing that host keys were regenerated)
ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
/var/lib/cloud/instances/iid-datasource-none
ubuntu@vm1:~$ grep "Generating public/private rsa key pair." /var/log/cloud-init-output.log
Generating public/private rsa key pair.
Generating public/private rsa key pair.

So far so good, since we expect this behaviour. But now we reboot this instance with the metadata API once again reachable. cloud-init rightly selects the original DataSourceOpenstack instance, but it does nothing because it already ran once (and it is set to run only once). The problem is that the original host keys are never restored, so any client connecting to that instance is faced with new host keys and a man-in-the-middle attack warning.

ubuntu@vm1:~$ sudo reboot
...
ubuntu@vm1:~$ readlink -f /var/lib/cloud/instance
/var/lib/cloud/instances/cd535bc4-9c2f-4d31-8903-0ede59c7ef95

Surely we could find a way for cloud-init to know that if the current DataSourceOpenstack uuid matches its previously run uuid, it can check that the host keys are consistent with the original run. @smoser suggested in a side discussion that dmidecode info could perhaps be used, since the OpenStack instance uuid can be found there:

ubuntu@vm1:~$ sudo dmidecode -t system
# dmidecode 2.12
SMBIOS 2.8 present.

Handle 0x0100, DMI type 1, 27 bytes
System Information
        Manufacturer: OpenStack Foundation
        Product Name: OpenStack Nova
        Version: 13.0.0
        Serial Number: ba5f7371-fd4c-a25e-132f-3dd1e5b92e93
        UUID: CD535BC4-9C2F-4D31-8903-0EDE59C7EF95
        Wake-up Type: Power Switch
        SKU Number: Not Specified
        Family: Virtual Machine

Handle 0x2000, DMI type 32, 11 bytes
System Boot Information
        Status: No errors detected

If cloud-init kept a copy of the previous host keys prior to regenerating them, it could presumably use this info to know when it is safe to restore the original host keys.

Since it is not inconceivable for the metadata API to become unreachable for a brief period (perhaps during an upgrade), I think we really need to make cloud-init more tolerant of this circumstance.
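A minimal sketch of the backup-and-restore idea described above: snapshot the host keys under the instance-id before regeneration, and put them back when the original datasource (matched by instance-id, or by the DMI UUID shown above) reappears. The paths and function names are hypothetical, not cloud-init's actual layout.

    import os
    import shutil

    SSH_DIR = '/etc/ssh'
    BACKUP_ROOT = '/var/lib/cloud/hostkey-backups'  # hypothetical location


    def backup_host_keys(instance_id):
        """Save the current SSH host keys before they are regenerated."""
        dest = os.path.join(BACKUP_ROOT, instance_id)
        os.makedirs(dest, exist_ok=True)
        for name in os.listdir(SSH_DIR):
            if name.startswith('ssh_host_'):
                shutil.copy2(os.path.join(SSH_DIR, name), dest)


    def restore_host_keys(instance_id):
        """Restore saved keys once the original datasource reappears."""
        src = os.path.join(BACKUP_ROOT, instance_id)
        if not os.path.isdir(src):
            return False  # nothing saved for this instance-id
        for name in os.listdir(src):
            shutil.copy2(os.path.join(src, name),
                         os.path.join(SSH_DIR, name))
        return True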
[Yahoo-eng-team] [Bug 1288438] Re: Neutron server takes a long time to recover from VIP move
** Also affects: neutron (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: neutron (Ubuntu Trusty)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1288438

Title: Neutron server takes a long time to recover from VIP move

Status in Fuel for OpenStack: Fix Committed
Status in neutron: Fix Released
Status in oslo-incubator: Fix Released
Status in neutron package in Ubuntu: New
Status in neutron source package in Trusty: New

Bug description:

Neutron waits sequentially for read_timeout seconds on each connection in its connection pool. The default pool_size is 10, so it takes 10 minutes for the Neutron server to become available again after the VIP is moved.

This is the log output from neutron-server after the VIP has been moved:

2014-03-05 17:48:23.844 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:49:23.887 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:50:24.055 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:51:24.067 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:52:24.079 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:53:24.115 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:54:24.123 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:55:24.131 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:56:24.143 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 17:57:24.163 9899 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')

Here is the log output after pool_size was changed to 7 and read_timeout to 30:

2014-03-05 18:50:25.300 15731 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 18:50:55.331 15731 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 18:51:25.351 15731 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 18:51:55.387 15731 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 18:52:25.415 15731 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 18:52:55.427 15731 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 18:53:25.439 15731 WARNING neutron.openstack.common.db.sqlalchemy.session [-] Got mysql server has gone away: (2013, 'Lost connection to MySQL server during query')
2014-03-05 18:53:25.549 15731 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): 192.168.0.2
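The timing in the two logs follows directly from pool size times per-connection timeout, since the stale connections are retired one at a time. A quick check (assuming the first run used a 60-second read_timeout, consistent with the one-minute spacing of its warnings):

    # Worst-case recovery window after a VIP move: every pooled
    # connection has to time out in turn before the pool is refreshed.
    pool_size = 10       # default SQLAlchemy pool size
    read_timeout = 60    # seconds per stale connection

    print(pool_size * read_timeout / 60.0, 'minutes')  # 10.0 -- first log

    # After tuning (pool_size=7, read_timeout=30):
    print(7 * 30 / 60.0, 'minutes')                    # 3.5 -- second log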
[Yahoo-eng-team] [Bug 1393391] Re: neutron-openvswitch-agent stuck on no queue 'q-agent-notifier-port-update_fanout..
** Also affects: neutron (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: neutron (Ubuntu Trusty)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1393391

Title: neutron-openvswitch-agent stuck on no queue 'q-agent-notifier-port-update_fanout..

Status in neutron: Confirmed
Status in neutron package in Ubuntu: New
Status in neutron source package in Trusty: New

Bug description:

Under an HA deployment, neutron-openvswitch-agent can get stuck when receiving a close command on a fanout queue the agent is not subscribed to. It stops responding to any other messages, so it effectively stops working at all.

2014-11-11 10:27:33.092 3027 INFO neutron.common.config [-] Logging enabled!
2014-11-11 10:27:34.285 3027 INFO neutron.openstack.common.rpc.common [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Connected to AMQP server on vip-rabbitmq:5672
2014-11-11 10:27:34.370 3027 INFO neutron.openstack.common.rpc.common [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Connected to AMQP server on vip-rabbitmq:5672
2014-11-11 10:27:35.348 3027 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Agent initialized successfully, now running...
2014-11-11 10:27:35.351 3027 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Agent out of sync with plugin!
2014-11-11 10:27:35.401 3027 INFO neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Agent tunnel out of sync with plugin!
2014-11-11 10:27:35.414 3027 INFO neutron.openstack.common.rpc.common [req-66ba318b-0fcc-42c2-959e-9a5233c292ef None] Connected to AMQP server on vip-rabbitmq:5672
2014-11-11 10:32:33.143 3027 INFO neutron.agent.securitygroups_rpc [req-22c7fa11-882d-4278-9f83-6dd56ab95ba4 None] Security group member updated [u'4c7b3ad2-4526-48a7-959e-a8b8e4da6413']
2014-11-11 10:58:11.916 3027 INFO neutron.agent.securitygroups_rpc [req-484fd71f-8f61-496c-aa8a-2d3abf8de365 None] Security group member updated [u'4c7b3ad2-4526-48a7-959e-a8b8e4da6413']
2014-11-11 10:59:43.954 3027 INFO neutron.agent.securitygroups_rpc [req-2c0bc777-04ed-470a-aec5-927a59100b89 None] Security group member updated [u'4c7b3ad2-4526-48a7-959e-a8b8e4da6413']
2014-11-11 11:00:22.500 3027 INFO neutron.agent.securitygroups_rpc [req-df447d01-d132-40f2-8528-1c1c4d57c0f5 None] Security group member updated [u'4c7b3ad2-4526-48a7-959e-a8b8e4da6413']
2014-11-12 01:27:35.662 3027 ERROR neutron.openstack.common.rpc.common [-] Failed to consume message from queue: Socket closed
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common Traceback (most recent call last):
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 579, in ensure
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return method(*args, **kwargs)
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/impl_kombu.py", line 659, in _consume
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return self.connection.drain_events(timeout=timeout)
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/kombu/connection.py", line 281, in drain_events
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return self.transport.drain_events(self.connection, **kwargs)
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/kombu/transport/pyamqp.py", line 94, in drain_events
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return connection.drain_events(**kwargs)
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 266, in drain_events
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     chanmap, None, timeout=timeout,
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 328, in _wait_multiple
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     channel, method_sig, args, content = read_timeout(timeout)
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common   File "/usr/lib/python2.7/site-packages/amqp/connection.py", line 292, in read_timeout
2014-11-12 01:27:35.662 3027 TRACE neutron.openstack.common.rpc.common     return self.method_reader.read_method()
2014-11-12
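The agent's consume loop dies inside drain_events and never re-subscribes. A standalone sketch of the resilient pattern with kombu: rebuild the connection and re-declare the queue on broker-side errors instead of wedging on the dead channel. This is generic illustration, not neutron's impl_kombu code, and the queue name, timeout, and exception set are assumptions.

    import socket

    import amqp
    import kombu


    def consume_forever(url, queue_name, on_message):
        """Drain messages forever, rebuilding the connection and
        re-declaring the queue after broker-side errors (sketch only)."""
        while True:
            try:
                with kombu.Connection(url) as conn:
                    queue = kombu.Queue(queue_name, auto_delete=True)
                    with conn.Consumer([queue], callbacks=[on_message]):
                        while True:
                            try:
                                conn.drain_events(timeout=60)
                            except socket.timeout:
                                # Idle window: keep the connection alive
                                # rather than treating it as an error.
                                conn.heartbeat_check()
            except (socket.error, amqp.ConnectionError):
                # Covers an unexpected Basic.cancel / closed socket: tear
                # everything down and start over instead of spinning on a
                # dead channel.
                continue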
[Yahoo-eng-team] [Bug 1382079] Re: Project selector not working
** Also affects: horizon (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Wily)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: horizon (Ubuntu Vivid)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1382079

Title: Project selector not working

Status in OpenStack Dashboard (Horizon): Fix Released
Status in horizon package in Ubuntu: Fix Released
Status in horizon source package in Vivid: New
Status in horizon source package in Wily: Fix Released
Status in horizon source package in Xenial: Fix Released

Bug description:

When you try to select a new project in the project dropdown, the project doesn't change.

The commit below introduced this bug on Horizon's master and passed the test verifications:

https://github.com/openstack/horizon/commit/16db58fabad8934b8fbdfc6aee0361cc138b20af

From what I have found so far, the context received in the decorator seems to be the old context, with the token for the previous project. When you take the decorator out, the "can_access" function receives the correct context, with the token for the new project.

Steps to reproduce:

1. Enable Identity v3 (to have a huge token)
2. Log in to Horizon (lots of permissions loaded into the session)
3. Verify that your SESSION_BACKEND is "signed_cookies"
4. Try to change the project in the dropdown

The project remains the same.
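Step 3 refers to Django's cookie-backed session store. In Horizon's local_settings.py that is the standard Django setting shown below; it matters here because with signed cookies the whole session, including the large Identity v3 token, is serialized into the cookie itself, so a stale copy can lag one request behind the project switch.

    # local_settings.py -- the session backend from step 3 of the
    # reproduction. With signed cookies, the entire session (including
    # the oversized Keystone v3 token) rides in the cookie.
    SESSION_ENGINE = 'django.contrib.sessions.backends.signed_cookies'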
[Yahoo-eng-team] [Bug 1430042] Re: Virtual Machine could not be evacuated because virtual interface creation failed
** Also affects: ubuntu
   Importance: Undecided
   Status: New

** Also affects: Ubuntu Trusty
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1430042

Title: Virtual Machine could not be evacuated because virtual interface creation failed

Status in ubuntu-cloud-archive: New
Status in OpenStack Compute (nova): Fix Released
Status in OpenStack Compute (nova) kilo series: Fix Released
Status in Ubuntu: New
Status in The Trusty Tahr: New

Bug description:

I believe this issue is related to Question 257358 (https://answers.launchpad.net/ubuntu/+source/nova/+question/257358).

On the source host we see the successful vif plug:

2015-03-09 01:22:12.363 629 DEBUG neutron.plugins.ml2.rpc [req-5de70341-d64b-4a3a-bc05-54eb2802f25d None] Device 14ac5edd-269f-4808-9a34-c4cc93e9ab70 up at agent ovs-agent-ipx update_device_up /usr/lib/python2.7/site-packages/neutron/plugins/ml2/rpc.py:156
2015-03-09 01:22:12.392 629 DEBUG oslo_concurrency.lockutils [req-5de70341-d64b-4a3a-bc05-54eb2802f25d ] Acquired semaphore "db-access" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:377
2015-03-09 01:22:12.436 629 DEBUG oslo_concurrency.lockutils [req-5de70341-d64b-4a3a-bc05-54eb2802f25d ] Releasing semaphore "db-access" lock /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:390
2015-03-09 01:22:12.437 629 DEBUG oslo_messaging._drivers.amqp [req-5de70341-d64b-4a3a-bc05-54eb2802f25d ] UNIQUE_ID is 740634ca8c7a49418a39c429669f2f27. _add_unique_id /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqp.py:224
2015-03-09 01:22:12.439 629 DEBUG oslo_messaging._drivers.amqp [req-5de70341-d64b-4a3a-bc05-54eb2802f25d ] UNIQUE_ID is 3264e8d7dd7c492d9aa17d3e9892b1fc. _add_unique_id /usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqp.py:224
2015-03-09 01:22:14.436 629 DEBUG neutron.notifiers.nova [-] Sending events: [{'status': 'completed', 'tag': u'14ac5edd-269f-4808-9a34-c4cc93e9ab70', 'name': 'network-vif-plugged', 'server_uuid': u'2790be4a-5285-46aa-8ee2-c68f5b936c1d'}] send_events /usr/lib/python2.7/site-packages/neutron/notifiers/nova.py:237

Later, the destination host of the evacuation attempts to plug the vif but can't:

2015-03-09 02:15:41.441 629 DEBUG neutron.plugins.ml2.rpc [req-5ea6625c-a60c-48fb-9264-e2a5a3ed0d26 None] Device 14ac5edd-269f-4808-9a34-c4cc93e9ab70 up at agent ovs-agent-ipxx update_device_up /usr/lib/python2.7/site-packages/neutron/plugins/ml2/rpc.py:156
2015-03-09 02:15:41.485 629 DEBUG neutron.plugins.ml2.rpc [req-5ea6625c-a60c-48fb-9264-e2a5a3ed0d26 None] Device 14ac5edd-269f-4808-9a34-c4cc93e9ab70 not bound to the agent host ipx update_device_up /usr/lib/python2.7/site-packages/neutron/plugins/ml2/rpc.py:163

The cause of the problem seems to be that the neutron port does not have its binding:host_id properly updated on evacuation; the answer to question 257358 looks like the fix.
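The workaround referenced above updates the port's binding to the evacuation target so the destination ovs agent sees the device as bound to it. A sketch using python-neutronclient (list_ports and update_port are the real client calls; the surrounding flow and names are illustrative):

    from neutronclient.v2_0 import client as neutron_client


    def rebind_ports_to_host(neutron, server_uuid, dest_host):
        """Point a VM's ports at the evacuation destination host.

        Sketch of the workaround: without this, the destination agent
        logs "Device ... not bound to the agent host" and never wires
        the VIF.
        """
        ports = neutron.list_ports(device_id=server_uuid)['ports']
        for port in ports:
            neutron.update_port(port['id'],
                                {'port': {'binding:host_id': dest_host}})


    # Example usage (credentials and host name are placeholders):
    # neutron = neutron_client.Client(username=..., password=...,
    #                                 tenant_name=..., auth_url=...)
    # rebind_ports_to_host(neutron,
    #                      '2790be4a-5285-46aa-8ee2-c68f5b936c1d',
    #                      'destination-compute-host')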
[Yahoo-eng-team] [Bug 1349888] Re: Attempting to attach the same volume multiple times can cause bdm record for existing attachment to be deleted.
** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: nova (Ubuntu Trusty)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1349888

Title: Attempting to attach the same volume multiple times can cause bdm record for existing attachment to be deleted.

Status in OpenStack Compute (nova): Fix Released
Status in nova package in Ubuntu: New
Status in nova source package in Trusty: New

Bug description:

nova assumes there is only ever one bdm per volume. When an attach is initiated, a new bdm is created; if the attach fails, a bdm for the volume is deleted, but it is not necessarily the one that was just created. The following steps show how a volume can get stuck detaching because of this.

$ nova list
+--------------------------------------+--------+--------+------------+-------------+------------------+
| ID                                   | Name   | Status | Task State | Power State | Networks         |
+--------------------------------------+--------+--------+------------+-------------+------------------+
| cb5188f8-3fe1-4461-8a9d-3902f7cc8296 | test13 | ACTIVE | -          | Running     | private=10.0.0.2 |
+--------------------------------------+--------+--------+------------+-------------+------------------+

$ cinder list
+--------------------------------------+-----------+--------+------+-------------+----------+-------------+
| ID                                   | Status    | Name   | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+--------+------+-------------+----------+-------------+
| c1e38e93-d566-4c99-bfc3-42e77a428cc4 | available | test10 | 1    | lvm1        | false    |             |
+--------------------------------------+-----------+--------+------+-------------+----------+-------------+

$ nova volume-attach test13 c1e38e93-d566-4c99-bfc3-42e77a428cc4
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| id       | c1e38e93-d566-4c99-bfc3-42e77a428cc4 |
| serverId | cb5188f8-3fe1-4461-8a9d-3902f7cc8296 |
| volumeId | c1e38e93-d566-4c99-bfc3-42e77a428cc4 |
+----------+--------------------------------------+

$ cinder list
+--------------------------------------+--------+--------+------+-------------+----------+--------------------------------------+
| ID                                   | Status | Name   | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+--------+--------+------+-------------+----------+--------------------------------------+
| c1e38e93-d566-4c99-bfc3-42e77a428cc4 | in-use | test10 | 1    | lvm1        | false    | cb5188f8-3fe1-4461-8a9d-3902f7cc8296 |
+--------------------------------------+--------+--------+------+-------------+----------+--------------------------------------+

$ nova volume-attach test13 c1e38e93-d566-4c99-bfc3-42e77a428cc4
ERROR (BadRequest): Invalid volume: status must be 'available' (HTTP 400) (Request-ID: req-1fa34b54-25b5-4296-9134-b63321b0015d)

$ nova volume-detach test13 c1e38e93-d566-4c99-bfc3-42e77a428cc4

$ cinder list
+--------------------------------------+-----------+--------+------+-------------+----------+--------------------------------------+
| ID                                   | Status    | Name   | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+-----------+--------+------+-------------+----------+--------------------------------------+
| c1e38e93-d566-4c99-bfc3-42e77a428cc4 | detaching | test10 | 1    | lvm1        | false    | cb5188f8-3fe1-4461-8a9d-3902f7cc8296 |
+--------------------------------------+-----------+--------+------+-------------+----------+--------------------------------------+

2014-07-29 14:47:13.952 ERROR oslo.messaging.rpc.dispatcher [req-134dfd17-14da-4de0-93fc-5d8d7bbf65a5 admin admin] Exception during message handling: can't be decoded
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher     incoming.message))
2014-07-29 14:47:13.952 31588 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2014-07-29 14:47:13.952 31588 TRACE
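The core bug is cleanup keyed only by volume_id: when two BDM records exist for the same volume, the failed attach's rollback can delete the record of the attachment that already succeeded. A self-contained toy demonstration of the race (a dict stands in for nova's BDM table; all names here are illustrative):

    bdms = {}        # bdm_id -> record; toy stand-in for nova's BDM table
    next_id = [0]


    def create_bdm(volume_id):
        next_id[0] += 1
        bdms[next_id[0]] = {'volume_id': volume_id}
        return next_id[0]


    def destroy_by_volume(volume_id):
        """Fragile cleanup: assumes one BDM per volume, so it may delete
        the record of the attachment that already succeeded."""
        for bdm_id, rec in list(bdms.items()):
            if rec['volume_id'] == volume_id:
                del bdms[bdm_id]
                break


    # First attach succeeds; the second attach fails its state check:
    good = create_bdm('c1e38e93')   # the in-use attachment's record
    bad = create_bdm('c1e38e93')    # record created by the doomed retry
    destroy_by_volume('c1e38e93')   # rollback -- but it removed `good`!
    assert good not in bdms and bad in bdms

    # The safe rollback targets exactly the record this attempt created:
    # del bdms[bad]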
[Yahoo-eng-team] [Bug 1327218] Re: Volume detach failure because of invalid bdm.connection_info
** Also affects: nova (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: nova (Ubuntu Trusty)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1327218

Title: Volume detach failure because of invalid bdm.connection_info

Status in OpenStack Compute (nova): Fix Released
Status in nova package in Ubuntu: New
Status in nova source package in Trusty: New

Bug description:

Example of this here:

http://logs.openstack.org/33/97233/1/check/check-grenade-dsvm/f7b8a11/logs/old/screen-n-cpu.txt.gz?level=TRACE#_2014-06-02_14_13_51_125

  File "/opt/stack/old/nova/nova/compute/manager.py", line 4153, in _detach_volume
    connection_info = jsonutils.loads(bdm.connection_info)
  File "/opt/stack/old/nova/nova/openstack/common/jsonutils.py", line 164, in loads
    return json.loads(s)
  File "/usr/lib/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
TypeError: expected string or buffer

This was in grenade with stable/icehouse nova commit 7431cb9.

There's nothing unusual about the test which triggers this -- it simply attaches a volume to an instance, waits for it to show up in the instance, and then tries to detach it.

logstash query for this:

message:"Exception during message handling" AND message:"expected string or buffer" AND message:"connection_info = jsonutils.loads(bdm.connection_info)" AND tags:"screen-n-cpu.txt"

but it seems to be very rare.
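The TypeError comes from calling json.loads on a connection_info that was never populated (it is still None when the detach races ahead of the attach). A minimal sketch of the guard; the function name is illustrative:

    import json


    def load_connection_info(bdm_connection_info):
        """Tolerate a BDM whose connection_info was never populated.

        Sketch of the guard: json.loads(None) raises exactly the
        "expected string or buffer" TypeError in the traceback, so a
        missing value is treated as 'no connection' instead.
        """
        if bdm_connection_info is None:
            return {}
        return json.loads(bdm_connection_info)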
[Yahoo-eng-team] [Bug 1459046] Re: [SRU] nova-* services do not start if rsyslog is not yet started
** Also affects: nova (Ubuntu Utopic)
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1459046

Title: [SRU] nova-* services do not start if rsyslog is not yet started

Status in OpenStack Compute (nova): New
Status in oslo.log: New
Status in nova package in Ubuntu: In Progress
Status in nova source package in Trusty: In Progress
Status in nova source package in Utopic: In Progress

Bug description:

[Impact]

* If Nova services are configured to log to syslog (use_syslog=True), they will currently fail with ECONNREFUSED if they cannot connect to syslog. This patch adds support for allowing nova to retry the connection a configurable number of times before printing an error message and continuing with startup.

[Test Case]

* Configure nova with use_syslog=True in nova.conf, stop the rsyslog service, and restart the nova services. Check the upstart nova logs to see the retries occurring, then start rsyslog and observe the connection succeed and nova-compute start up.

[Regression Potential]

* None
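A standalone sketch of the retry behaviour the SRU describes, using the stdlib syslog handler: keep retrying the socket while rsyslog comes up, then fall back to stderr so startup can continue. The function name, retry counts, and fallback are illustrative, not the patched oslo/nova code.

    import logging
    import logging.handlers
    import socket
    import time


    def syslog_handler_with_retries(address='/dev/log', retries=5, delay=2):
        """Try to attach a syslog handler, retrying while rsyslog starts.

        Sketch only: connecting to a missing syslog socket raises
        socket.error (ECONNREFUSED), which is what kills the services
        today; after the retries are exhausted we log an error and fall
        back to stderr instead of aborting startup.
        """
        for attempt in range(retries):
            try:
                return logging.handlers.SysLogHandler(address=address)
            except socket.error:
                time.sleep(delay)
        logging.getLogger(__name__).error(
            'could not connect to syslog at %s; falling back to stderr',
            address)
        return logging.StreamHandler()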