[Yahoo-eng-team] [Bug 1838592] [NEW] WebSSO unable to support multiple identity providers

2019-07-31 Thread Guang Yee
Public bug reported:

When performing WebSSO authentication (i.e. OpenID Connect), if multiple
identity providers exist, Keystone yields the following error regardless of
protocol and mapping association.

Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application [None 
req-00ae9c5a-5d05-43d9-b15b-585720f7aefa None None] Could not find federated 
protocol openid for Identity Provider: 4afcec6e3c45565103e8f71665dff443f3e>
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application Traceback (most recent call last):
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 266, in 
error_router
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application return self.handle_error(e)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/usr/lib/python2.7/site-packages/flask/app.py", line 1949, in 
full_dispatch_request
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application rv = self.dispatch_request()
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/usr/lib/python2.7/site-packages/flask/app.py", line 1935, in dispatch_request
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application return 
self.view_functions[rule.endpoint](**req.view_args)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 458, in 
wrapper
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application resp = resource(*args, **kwargs)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/usr/lib/python2.7/site-packages/flask/views.py", line 89, in view
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application return self.dispatch_request(*args, 
**kwargs)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/usr/lib/python2.7/site-packages/flask_restful/__init__.py", line 573, in 
dispatch_request
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application resp = meth(*args, **kwargs)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/opt/stack/keystone/keystone/server/flask/common.py", line 1064, in wrapper
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application return f(*args, **kwargs)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/opt/stack/keystone/keystone/api/auth.py", line 359, in get
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application return self._perform_auth(protocol_id)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/opt/stack/keystone/keystone/api/auth.py", line 340, in _perform_auth
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application idp, protocol_id)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/opt/stack/keystone/keystone/federation/utils.py", line 286, in 
get_remote_id_parameter
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application protocol_ref = 
PROVIDERS.federation_api.get_protocol(idp['id'], protocol)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/opt/stack/keystone/keystone/federation/backends/sql.py", line 279, in 
get_protocol
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application protocol_ref = 
self._get_protocol(session, idp_id, protocol_id)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application   File 
"/opt/stack/keystone/keystone/federation/backends/sql.py", line 255, in 
_get_protocol
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application raise 
exception.FederatedProtocolNotFound(**kwargs)
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
keystone.server.flask.application FederatedProtocolNotFound: Could not find 
federated protocol openid for Identity Provider: 
4afcec6e3c45565103e8f71665dff443f3eff2107ade89918207aa60d95063a3.
Aug 01 03:41:21 localhost devstack@keystone.service[26546]: ERROR 
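
A minimal sketch, under assumptions, of a more forgiving lookup than the one
failing above: rather than letting the first FederatedProtocolNotFound abort
the whole WebSSO request, skip identity providers that have no association
for the requested protocol. The helper below is illustrative only (it is not
keystone's actual fix); list_idps() and get_protocol() mirror the calls
visible in the traceback, and the stand-in exception replaces
keystone.exception.FederatedProtocolNotFound so the snippet is self-contained.

    class FederatedProtocolNotFound(Exception):
        """Stand-in for keystone.exception.FederatedProtocolNotFound."""

    def pick_idp_for_protocol(federation_api, protocol_id):
        """Return the first identity provider that implements protocol_id."""
        for idp in federation_api.list_idps():
            try:
                federation_api.get_protocol(idp['id'], protocol_id)
            except FederatedProtocolNotFound:
                # This IdP is associated with a different protocol (e.g.
                # saml2); skip it instead of failing the WebSSO request.
                continue
            return idp
        raise FederatedProtocolNotFound(protocol_id)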

[Yahoo-eng-team] [Bug 1838587] [NEW] request neutron with Incorrect body key return 500

2019-07-31 Thread ZhouHeng
Public bug reported:

In current Neutron, when I update a resource with an incorrect body, the
neutron server returns a 500 NeutronError. It should instead return 400
(BadRequest).

example:
PUT /v2.0/networks/
body:
{
"subnet": {
...
 }
}
The neutron server returns:
{"NeutronError": {"message": "Request Failed: internal server error while 
processing your request.", "type": "HTTPInternalServerError", "detail": ""}}


Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/pecan/core.py", line 683, in __call__
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation self.invoke_controller(controller, 
args, kwargs, state)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/pecan/core.py", line 574, in invoke_controller
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation result = controller(*args, **kwargs)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/neutron_lib/db/api.py", line 139, in wrapped
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation setattr(e, '_RETRY_EXCEEDED', True)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation self.force_reraise()
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, 
self.tb)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/neutron_lib/db/api.py", line 135, in wrapped
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_db/api.py", line 154, in wrapper
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation ectxt.value = e.inner_exc
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation self.force_reraise()
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, 
self.tb)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_db/api.py", line 142, in wrapper
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/neutron_lib/db/api.py", line 183, in wrapped
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation LOG.debug("Retry wrapper got retriable 
exception: %s", e)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation self.force_reraise()
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in 
force_reraise
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation six.reraise(self.type_, self.value, 
self.tb)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/usr/lib/python2.7/site-packages/neutron_lib/db/api.py", line 179, in wrapped
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation return f(*dup_args, **dup_kwargs)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation   File 
"/opt/stack/neutron/neutron/pecan_wsgi/controllers/utils.py", line 76, in 
wrapped
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
neutron.pecan_wsgi.hooks.translation return f(*args, **kwargs)
Jul 31 20:56:08 nfs neutron-server[11250]: ERROR 
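
A minimal sketch, under assumptions, of the behaviour the report asks for:
validate that the request body carries the resource's own top-level key
before dispatching to the plugin, and turn a mismatch into a 400 rather than
letting a later lookup surface as a 500. The helper name and its placement
are hypothetical, not neutron's actual fix.

    import webob.exc

    def extract_resource_body(resource, body):
        """Return body[resource], or raise 400 when the wrong key is sent,
        e.g. a "subnet" body sent to PUT /v2.0/networks/<id>."""
        if not isinstance(body, dict) or resource not in body:
            raise webob.exc.HTTPBadRequest(
                explanation="Invalid request body: expected a top-level "
                            "'%s' object." % resource)
        return body[resource]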

[Yahoo-eng-team] [Bug 1834176] Re: [RFE] Neutron enhancements to support per-physnet and IPoIB interface drivers

2019-07-31 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/670723
Committed: 
https://git.openstack.org/cgit/openstack/neutron/commit/?id=0e80d2251e6b1cf8521418a703cff7cb149cb8e1
Submitter: Zuul
Branch:master

commit 0e80d2251e6b1cf8521418a703cff7cb149cb8e1
Author: Adrian Chiris 
Date:   Sun Jul 14 11:08:31 2019 +0300

Pass get_networks() callback to interface driver

In order to support out of tree interface drivers it is required
to pass a callback to allow the drivers to query information about
the network.

- Allow passing **kwargs to interface drivers
- Pass get_networks() as `get_networks_cb` kw arg
  `get_networks_cb` has the same API as
  `neutron.neutron_plugin_base_v2.NeutronPluginBaseV2.get_networks()`
   minus the request context, which will be embedded in the callback
   itself.

The out of tree interface drivers in question are:

MultiInterfaceDriver - a per-physnet interface driver that delegates
   operations on a per-physnet basis.
IPoIBInterfaceDriver - an interface driver for IPoIB (IP over Infiniband)
   networks.

Those drivers are a part of networking-mlnx[1]. Their implementation
is vendor agnostic, so they can later be moved to a more common place
if desired.

[1] https://github.com/openstack/networking-mlnx

Change-Id: I74d9f449fb24f64548b0f6db4d5562f7447efb25
Closes-Bug: #1834176
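
A hedged sketch of how an out-of-tree driver might consume the new keyword
argument. Only the `get_networks_cb` name and its get_networks()-like
signature come from the commit message above; the class and its internals
are illustrative assumptions, not the networking-mlnx implementation.

    class MultiInterfaceDriverSketch(object):
        """Illustrative only; the real driver lives in networking-mlnx."""

        def __init__(self, conf, **kwargs):
            self.conf = conf
            # Supplied by the L3/DHCP agent; same API as
            # NeutronPluginBaseV2.get_networks() minus the request context.
            self._get_networks = kwargs.get('get_networks_cb')

        def _physnet_for(self, network_id):
            # Ask only for the field needed to pick a per-physnet driver.
            nets = self._get_networks(
                filters={'id': [network_id]},
                fields=['provider:physical_network'])
            return nets[0].get('provider:physical_network') if nets else None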


** Changed in: neutron
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1834176

Title:
  [RFE] Neutron enhancements to support per-physnet and IPoIB interface
  drivers

Status in neutron:
  Fix Released

Bug description:
  Networking-mlnx will add support for the following interface drivers:

  IPoIBInterfaceDriver - An interface driver to plug/unplug IPoIB interfaces.
     This driver will allow Neutron's DHCP and L3 agents to
     provide Routing and DHCP functionality on IP over
     Infiniband networks[1]

  MultiInterfaceDriver - An interface driver that (as the name suggests)
     supports multiple interface drivers. It delegates the operations to
     drivers on a per-physnet basis. This driver enables L3 and
     DHCP agents to provide DHCP and Routing functionality on
     a per-physnet basis, allowing the same agent to provide
     DHCP/Routing for both ethernet and infiniband based
     physnets.

  Relevant commits in networking-mlnx can be found here[2]

  For these drivers to integrate with Neutron, some minimal changes are
  required in neutron codebase.

  1. Add interface kind property to `ip_lib.IPDevice` to
     express the type of interface created by an interface driver

  2. Add get_networks() RPC to L3 and DHCP agent

  3. Pass get_networks() as kwarg to interface driver constructor.

  Relevant commits for neutron can be found here[3]

  It should be noted that both interface drivers are vendor agnostic.
  If there is community interest in enabling IP networks over an Infiniband
  fabric, the defined interface drivers can be moved to a common place.

  This RFE was repurposed from its original version in accordance with the
  neutron drivers meeting of 12.07.2019[4]

  [1] https://tools.ietf.org/html/rfc4390 - DHCP over infiniband
  https://tools.ietf.org/html/rfc4391 - Transmission of IPoIB
  https://tools.ietf.org/html/rfc4392 - IPoIB
  [2] https://review.opendev.org/#/c/670724
  [3] https://review.opendev.org/#/c/670723
  [4] 
http://eavesdrop.openstack.org/irclogs/%23openstack-meeting/%23openstack-meeting.2019-07-12.log.html#t2019-07-12T14:00:07

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1834176/+subscriptions



[Yahoo-eng-team] [Bug 1838438] Re: Rename custom theme on drop down menu

2019-07-31 Thread Ivan Kolodyazhny
Paul, theme name is configured in the horizon config via
AVAILABLE_THEMES option [1]

[1]
https://docs.openstack.org/horizon/latest/configuration/settings.html
#available-themes
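
For reference, a hedged example of the setting; the tuples are
(name, label, path) and the second element is the label shown in the theme
drop-down. The custom theme path below is a made-up example.

    # local_settings.py
    AVAILABLE_THEMES = [
        ('default', 'Default', 'themes/default'),
        ('material', 'Material', 'themes/material'),
        # The label ('Acme Cloud') is what appears in the drop-down menu
        # instead of the generic "custom" name.
        ('custom', 'Acme Cloud', '/usr/share/openstack-dashboard/custom_theme'),
    ]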

** Changed in: horizon
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1838438

Title:
  Rename custom theme on drop down menu

Status in OpenStack Dashboard (Horizon):
  Invalid

Bug description:
  When dashboard has a custom theme it is shown on the drop down menu as
  "custom".

  Is it possible to change the name of the theme from "custom" or can
  this feature be added?

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1838438/+subscriptions



[Yahoo-eng-team] [Bug 1838563] [NEW] Timeout in executing ovs command crash ovs agent

2019-07-31 Thread Slawek Kaplonski
Public bug reported:

When a timeout occurs while executing an ovs command during agent
initialization, the agent crashes and will not try to recover.

Example of such error in CI: http://logs.openstack.org/84/673784/1/check
/tempest-multinode-full-
py3/283e76b/compute1/logs/screen-q-agt.txt.gz#_Jul_31_17_48_48_877755
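
A minimal sketch, under assumptions, of the behaviour this bug asks for:
retry the initialization step on an ovs command timeout instead of letting
the exception kill the agent. The exception class and the retry helper are
hypothetical stand-ins, not the agent's actual code.

    import time

    class CommandTimeout(Exception):
        """Stand-in for the timeout raised by the ovs command layer."""

    def initialize_with_retry(setup_bridges, retries=5, delay=5):
        """Retry agent initialization instead of crashing on a timeout."""
        for attempt in range(1, retries + 1):
            try:
                return setup_bridges()
            except CommandTimeout:
                if attempt == retries:
                    raise  # give up only after several attempts
                time.sleep(delay)  # back off, then retry the ovs commands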

** Affects: neutron
 Importance: High
 Assignee: Slawek Kaplonski (slaweq)
 Status: Confirmed


** Tags: gate-failure ovs

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to neutron.
https://bugs.launchpad.net/bugs/1838563

Title:
  Timeout in executing ovs command crash ovs agent

Status in neutron:
  Confirmed

Bug description:
  When a timeout occurs while executing an ovs command during agent
  initialization, the agent crashes and will not try to recover.

  Example of such error in CI:
  http://logs.openstack.org/84/673784/1/check/tempest-multinode-full-
  py3/283e76b/compute1/logs/screen-q-agt.txt.gz#_Jul_31_17_48_48_877755

To manage notifications about this bug go to:
https://bugs.launchpad.net/neutron/+bug/1838563/+subscriptions



[Yahoo-eng-team] [Bug 1838564] [NEW] cloud-init's initramfs handling should support dracut

2019-07-31 Thread Dan Watkins
Public bug reported:

Currently, cloud-init has support for generating network configuration
from the /run/net-* files that klibc writes out if it performs network
configuration in the initramfs.  To better support iSCSI root for
distributions that use dracut for their initramfs, we should implement
support for handling its network configuration data.
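
For context, a hedged sketch of the klibc side that cloud-init already
handles: each /run/net-<iface>.conf file is a flat KEY=VALUE shell fragment.
A dracut-aware implementation would need an equivalent parser for whatever
state files dracut writes; the dracut format itself is not shown here, and
the key names below are typical klibc ipconfig output, not a guaranteed list.

    import glob

    def parse_klibc_net_files(pattern='/run/net-*.conf'):
        """Parse klibc ipconfig output into simple dicts (illustrative)."""
        configs = {}
        for path in glob.glob(pattern):
            entries = {}
            with open(path) as f:
                for line in f:
                    line = line.strip()
                    if not line or line.startswith('#') or '=' not in line:
                        continue
                    key, _, value = line.partition('=')
                    # Typical keys: DEVICE, IPV4ADDR, IPV4NETMASK, IPV4GATEWAY
                    entries[key] = value.strip('\'"')
            configs[path] = entries
        return configs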

** Affects: cloud-init
 Importance: Wishlist
 Status: Triaged

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to cloud-init.
https://bugs.launchpad.net/bugs/1838564

Title:
  cloud-init's initramfs handling should support dracut

Status in cloud-init:
  Triaged

Bug description:
  Currently, cloud-init has support for generating network configuration
  from the /run/net-* files that klibc writes out if it performs network
  configuration in the initramfs.  To better support iSCSI root for
  distributions that use dracut for their initramfs, we should implement
  support for handling its network configuration data.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1838564/+subscriptions



[Yahoo-eng-team] [Bug 1838554] [NEW] Specify keystone is OS user for fernet and credential setup

2019-07-31 Thread Mihail Milev
Public bug reported:

- [ ] This doc is inaccurate in this way: __
- [x] This is a doc addition request.
- [ ] I have a fix to the document that I can paste below including example: 
input and output. 

I would suggest that in the chapter "Install and configure components",
point 4 "Initialize Fernet key repositories" be clarified so that the
reader knows the username "keystone" and the group name "keystone" are
not freely chosen, but refer to the operating system's "keystone" user
and group.

** Affects: keystone
 Importance: Undecided
 Status: New


** Tags: documentation

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Identity (keystone).
https://bugs.launchpad.net/bugs/1838554

Title:
  Specify keystone is OS user for fernet and credential setup

Status in OpenStack Identity (keystone):
  New

Bug description:
  - [ ] This doc is inaccurate in this way: __
  - [x] This is a doc addition request.
  - [ ] I have a fix to the document that I can paste below including example: 
input and output. 

  I would suggest that in the chapter "Install and configure components",
  point 4 "Initialize Fernet key repositories" be clarified so that the
  reader knows the username "keystone" and the group name "keystone" are
  not freely chosen, but refer to the operating system's "keystone" user
  and group.

To manage notifications about this bug go to:
https://bugs.launchpad.net/keystone/+bug/1838554/+subscriptions



[Yahoo-eng-team] [Bug 1607400] Re: UEFI not supported on SLES

2019-07-31 Thread OpenStack Infra
** Changed in: nova
   Status: Invalid => In Progress

** Changed in: nova
 Assignee: Dirk Mueller (dmllr) => Kashyap Chamarthy (kashyapc)

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1607400

Title:
  UEFI not supported on SLES

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  Launching an image with a UEFI bootloader on a SLES 12 SP1 instance
  gives

  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800] Traceback (most recent call last):
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800]   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2218, in 
_build_resources
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800] yield resources
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800]   File 
"/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2064, in 
_build_and_run_instance
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800] block_device_info=block_device_info)
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800]   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2777, in 
spawn
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800] write_to_disk=True)
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800]   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4730, in 
_get_guest_xml
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800] context)
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800]   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4579, in 
_get_guest_config
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800] root_device_name)
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800]   File 
"/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4401, in 
_configure_guest_by_virt_type
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800] raise exception.UEFINotSupported()
  2016-07-28 08:23:12.820 3224 ERROR nova.compute.manager [instance: 
5289d6f7-f4f5-4f95-bd55-4812ec3ab800] UEFINotSupported: UEFI is not supported

  This is because the function probes for files that are in different
  locations on SLES: it looks for "/usr/share/OVMF/OVMF_CODE.fd" and
  "/usr/share/AAVMF/AAVMF_CODE.fd", which are the documented upstream
  defaults. However, the SLES libvirt is compiled to default to different
  paths, which do exist.

  One possibility would be to introspect domCapabilities from libvirt,
  which works just fine. An alternative patch is to just add the
  alternative paths for now.
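
  A hedged sketch of the path-probing alternative mentioned above; the SLES
  firmware location is an assumed example, and nova's actual fix may instead
  introspect libvirt's domain capabilities.

    import os

    # Candidate UEFI loader paths; the first two are the documented upstream
    # defaults, the third is an assumed SLES location for illustration.
    UEFI_LOADER_CANDIDATES = [
        '/usr/share/OVMF/OVMF_CODE.fd',           # x86_64 upstream default
        '/usr/share/AAVMF/AAVMF_CODE.fd',         # aarch64 upstream default
        '/usr/share/qemu/ovmf-x86_64-code.bin',   # assumed SLES location
    ]

    def find_uefi_loader():
        """Return the first existing loader path, or None (UEFI unsupported)."""
        for path in UEFI_LOADER_CANDIDATES:
            if os.path.exists(path):
                return path
        return None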

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1607400/+subscriptions



[Yahoo-eng-team] [Bug 1838541] Re: Spurious warnings in compute logs while building/unshelving an instance: Instance cf1dc8a6-48fe-42fd-90a7-d352c58e1454 is not being actively managed by this compute

2019-07-31 Thread Matt Riedemann
Technically this goes back to Pike but I'm not sure we care about fixing
it there at this point since Pike is in Extended Maintenance mode
upstream. Someone can backport it to stable/pike if they care to.

** Also affects: nova/stein
   Importance: Undecided
   Status: New

** Also affects: nova/queens
   Importance: Undecided
   Status: New

** Also affects: nova/rocky
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1838541

Title:
  Spurious warnings in compute logs while building/unshelving an
  instance: Instance cf1dc8a6-48fe-42fd-90a7-d352c58e1454 is not being
  actively managed by this compute host but has allocations referencing
  this compute host: {u'resources': {u'VCPU': 1, u'MEMORY_MB': 64}}.
  Skipping heal of allocation because we do not know what to do.

Status in OpenStack Compute (nova):
  In Progress
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  Confirmed
Status in OpenStack Compute (nova) stein series:
  Confirmed

Bug description:
  This warning log from the ResourceTracker is logged quite a bit in CI
  runs:

  
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22is%20not%20being%20actively%20managed%20by%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22=7d

  2601 hits in 7 days.

  Looking at one of these the warning shows up while spawning the
  instance during an unshelve operation. This is a possible race for the
  rt.instance_claim call because the instance.host/node are set here:

  
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L208

  before the instance would be added to the rt.tracked_instances dict
  started here:

  
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L217

  If the update_available_resource periodic task runs between those
  times, we'll call _remove_deleted_instances_allocations with the
  instance and it will have allocations on the node, created by the
  scheduler, but may not be in tracked_instances yet so we don't short-
  circuit here:

  
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1339

  And hit the log condition here:

  
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1397

  We should probably downgrade that warning to DEBUG if the instance
  task_state is set since clearly the instance is undergoing some state
  transition. We should log the task_state and only log the message as a
  warning if the instance does not have a task_state set but is also not
  tracked on the host.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1838541/+subscriptions



[Yahoo-eng-team] [Bug 1838541] [NEW] Spurious warnings in compute logs while building/unshelving an instance: Instance cf1dc8a6-48fe-42fd-90a7-d352c58e1454 is not being actively managed by this comput

2019-07-31 Thread Matt Riedemann
Public bug reported:

This warning log from the ResourceTracker is logged quite a bit in CI
runs:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22is%20not%20being%20actively%20managed%20by%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22=7d

2601 hits in 7 days.

Looking at one of these the warning shows up while spawning the instance
during an unshelve operation. This is a possible race for the
rt.instance_claim call because the instance.host/node are set here:

https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L208

before the instance would be added to the rt.tracked_instances dict
started here:

https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L217

If the update_available_resource periodic task runs between those times,
we'll call _remove_deleted_instances_allocations with the instance and
it will have allocations on the node, created by the scheduler, but may
not be in tracked_instances yet so we don't short-circuit here:

https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1339

And hit the log condition here:

https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1397

We should probably downgrade that warning to DEBUG if the instance
task_state is set since clearly the instance is undergoing some state
transition. We should log the task_state and only log the message as a
warning if the instance does not have a task_state set but is also not
tracked on the host.
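
A minimal sketch of the proposed logging change (placement and names are
hypothetical; the real code lives in _remove_deleted_instances_allocations):
include the task_state in the message and only warn when no state transition
is in flight.

    from oslo_log import log as logging

    LOG = logging.getLogger(__name__)

    def _log_unmanaged_allocation(instance_uuid, task_state, alloc):
        msg = ('Instance %s is not being actively managed by this compute '
               'host but has allocations referencing this compute host: %s. '
               'Skipping heal of allocation (task_state=%s).')
        if task_state is not None:
            # Mid-operation (e.g. spawning or unshelving): expected transient
            # condition, so log at DEBUG instead of WARNING.
            LOG.debug(msg, instance_uuid, alloc, task_state)
        else:
            LOG.warning(msg, instance_uuid, alloc, task_state)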

** Affects: nova
 Importance: Medium
 Status: Triaged


** Tags: resource-tracker serviceability

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1838541

Title:
  Spurious warnings in compute logs while building/unshelving an
  instance: Instance cf1dc8a6-48fe-42fd-90a7-d352c58e1454 is not being
  actively managed by this compute host but has allocations referencing
  this compute host: {u'resources': {u'VCPU': 1, u'MEMORY_MB': 64}}.
  Skipping heal of allocation because we do not know what to do.

Status in OpenStack Compute (nova):
  Triaged

Bug description:
  This warning log from the ResourceTracker is logged quite a bit in CI
  runs:

  
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22is%20not%20being%20actively%20managed%20by%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22=7d

  2601 hits in 7 days.

  Looking at one of these the warning shows up while spawning the
  instance during an unshelve operation. This is a possible race for the
  rt.instance_claim call because the instance.host/node are set here:

  
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L208

  before the instance would be added to the rt.tracked_instances dict
  started here:

  
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L217

  If the update_available_resource periodic task runs between those
  times, we'll call _remove_deleted_instances_allocations with the
  instance and it will have allocations on the node, created by the
  scheduler, but may not be in tracked_instances yet so we don't short-
  circuit here:

  
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1339

  And hit the log condition here:

  
https://github.com/openstack/nova/blob/619c0c676aae5359225c54bc27ce349e138e420e/nova/compute/resource_tracker.py#L1397

  We should probably downgrade that warning to DEBUG if the instance
  task_state is set since clearly the instance is undergoing some state
  transition. We should log the task_state and only log the message as a
  warning if the instance does not have a task_state set but is also not
  tracked on the host.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1838541/+subscriptions



[Yahoo-eng-team] [Bug 1834349] Re: Error updating resources for node ubuntu-bionic-ovh-bhs1-0008373888.: AttributeError: 'NoneType' object has no attribute 'flavorid' - race with resize confirm

2019-07-31 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/667687
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=818419c9d313bd6151d1a05b3a087a15116f61b8
Submitter: Zuul
Branch:master

commit 818419c9d313bd6151d1a05b3a087a15116f61b8
Author: Matt Riedemann 
Date:   Wed Jun 26 13:26:44 2019 -0400

Fix AttributeError in RT._update_usage_from_migration

Change Ieb539c9a0cfbac743c579a1633234537a8e3e3ee in Stein
added some logging in _update_usage_from_migration to log
the flavor for an inbound and outbound migration.

If an instance is resized and then the resize is immediately
confirmed, it's possible to race with ComputeManager._confirm_resize
setting the instance.old_flavor to None before the migration
status is changed to "confirmed" while the update_available_resource
periodic is running which will result in _update_usage_from_migration
hitting an AttributeError when trying to log instance.old_flavor.flavorid
since instance.old_flavor is None.

There are a few key points there:

- We get into _update_usage_from_migration because the
  _update_available_resource method gets in-progress migrations
  related to the host (in this case the source compute) and the
  migration is consider in-progress until its status is "confirmed".

- The instance is not in the tracked_instances dict when
  _update_usage_from_migration runs because RT only tracks instances
  where the instance.host matches the RT.host and in this case the
  instance has been resized to another compute and the instance.host
  is pointing at the dest compute.

The fix here is to simply check if we got the instance.old_flavor and
not log the message if we do not have it, which gets us back to the old
behavior.

This bug was found by noticing it in CI job logs - there is a link to
hits in logstash in the bug report.

As for the "incoming and not tracked" case in _update_usage_from_migration
I have not modified that since I am not sure we have the same race nor
have I seen it in CI logs.

Change-Id: I43e34b3ff1424d42632a3e8f842c93508905aa1a
Closes-Bug: #1834349
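
A hedged sketch of the guard described in the commit message; the helper is
a simplification for illustration, not the exact nova diff.

    def log_outgoing_migration(LOG, migration_uuid, old_flavor):
        """old_flavor can be None when _confirm_resize races with the
        update_available_resource periodic task."""
        if old_flavor is not None:
            LOG.debug('Starting to track outgoing migration %s with flavor '
                      '%s', migration_uuid, old_flavor.flavorid)
        # When old_flavor is None, skip the message entirely, restoring the
        # pre-Stein behaviour instead of raising AttributeError.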


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1834349

Title:
  Error updating resources for node ubuntu-bionic-ovh-bhs1-0008373888.:
  AttributeError: 'NoneType' object has no attribute 'flavorid' - race
  with resize confirm

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) stein series:
  Confirmed

Bug description:
  Seeing this in CI jobs:

  http://logs.openstack.org/45/666845/1/check/neutron-tempest-dvr-ha-
  multinode-
  full/5b09053/controller/logs/screen-n-cpu.txt#_Jun_26_13_51_48_973568

  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager [None req-903b3e73-3ce3-4f5c-9a30-811383077679 None 
None] Error updating resources for node ubuntu-bionic-ovh-bhs1-0008373888.: 
AttributeError: 'NoneType' object has no attribute 'flavorid'
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager Traceback (most recent call last):
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager   File "/opt/stack/nova/nova/compute/manager.py", 
line 8101, in _update_available_resource_for_node
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager startup=startup)
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager   File 
"/opt/stack/nova/nova/compute/resource_tracker.py", line 735, in 
update_available_resource
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager self._update_available_resource(context, 
resources, startup=startup)
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager   File 
"/usr/local/lib/python3.6/dist-packages/oslo_concurrency/lockutils.py", line 
328, in inner
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager return f(*args, **kwargs)
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager   File 
"/opt/stack/nova/nova/compute/resource_tracker.py", line 783, in 
_update_available_resource
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager self._update_usage_from_migrations(context, 
migrations, nodename)
  Jun 26 13:51:48.973568 ubuntu-bionic-ovh-bhs1-0008373888 nova-compute[27912]: 
ERROR nova.compute.manager   File 

[Yahoo-eng-team] [Bug 1664793] Re: test_stamp_pattern timing out waiting for attached device to show up in guest

2019-07-31 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/615434
Committed: 
https://git.openstack.org/cgit/openstack/tempest/commit/?id=ba18426fd990fad19f429e0aa1673f549f2c77e8
Submitter: Zuul
Branch:master

commit ba18426fd990fad19f429e0aa1673f549f2c77e8
Author: Attila Fazekas 
Date:   Sun Nov 4 13:54:30 2018 +0100

Unskip test_stamp_pattern

test_stamp_pattern had issues before because the test attached volumes
while the VM was in a state where it does not detect hotplug events.

This change has the test SSH into the machine first; alternatively,
a PCI rescan could be forced.

Notes:

https://docs.google.com/presentation/d/1Im-iYVzroKwXKP23p12Q5vsUGdk2V26SPpLWF3I5dbA/edit#slide=id.p

Closes-bug: #1664793
Change-Id: Iaff1e01dd7ffab238ec73668ae4eee0683f70ffd
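
A hedged sketch of the wait-for-device pattern the unskipped test relies on;
the ssh_exec callable is a stand-in, not tempest's RemoteClient API, and the
device names checked are assumptions.

    import time

    def wait_for_attached_volume(ssh_exec, timeout=300, interval=5):
        """Poll the guest until a second disk (e.g. vdb or sdb) appears."""
        deadline = time.time() + timeout
        while time.time() < deadline:
            # ssh_exec runs a command in the guest and returns its stdout.
            disks = ssh_exec('lsblk -ndo NAME')
            if any(name in disks for name in ('vdb', 'sdb')):
                return
            time.sleep(interval)
        raise TimeoutError('attached volume never appeared in the guest')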


** Changed in: tempest
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1664793

Title:
  test_stamp_pattern timing out waiting for attached device to show up
  in guest

Status in OpenStack Compute (nova):
  Confirmed
Status in tempest:
  Fix Released

Bug description:
  The Tempest scenario test TestStampPattern was unskipped on 2/13:

  https://review.openstack.org/#/c/431800/

  Since then the ceph jobs have been failing, e.g.:

  http://logs.openstack.org/25/433825/1/check/gate-tempest-dsvm-full-
  devstack-plugin-ceph-ubuntu-
  trusty/4a58a2e/console.html#_2017-02-14_23_40_42_737153

  2017-02-14 23:40:42.736898 | 
tempest.scenario.test_stamp_pattern.TestStampPattern.test_stamp_pattern[compute,id-10fd234a-515c-41e5-b092-8323060598c5,image,network,volume]
  2017-02-14 23:40:42.736948 | 
-
  2017-02-14 23:40:42.736959 | 
  2017-02-14 23:40:42.736975 | Captured traceback:
  2017-02-14 23:40:42.736992 | ~~~
  2017-02-14 23:40:42.737013 | Traceback (most recent call last):
  2017-02-14 23:40:42.737037 |   File "tempest/test.py", line 103, in 
wrapper
  2017-02-14 23:40:42.737061 | return f(self, *func_args, **func_kwargs)
  2017-02-14 23:40:42.737094 |   File 
"tempest/scenario/test_stamp_pattern.py", line 112, in test_stamp_pattern
  2017-02-14 23:40:42.737115 | keypair['private_key'])
  2017-02-14 23:40:42.737153 |   File 
"tempest/scenario/test_stamp_pattern.py", line 89, in 
_wait_for_volume_available_on_the_system
  2017-02-14 23:40:42.737176 | raise lib_exc.TimeoutException
  2017-02-14 23:40:42.737203 | tempest.lib.exceptions.TimeoutException: 
Request timed out
  2017-02-14 23:40:42.737218 | Details: None

  
  They fail while waiting for the attached volume to show up on the guest, and 
it never does.

  
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22_wait_for_volume_available_on_the_system%5C%22%20AND%20tags%3A%5C%22console%5C%22%20AND%20build_name%3A*ceph*=7d

  There are only 11 hits in the ceph jobs in the last 24 hours but
  that's still pretty high. The ceph xenial job which runs on newton,
  ocata and master is non-voting so people probably aren't noticing, but
  the ceph trusty job runs on stable/mitaka and is voting and is
  failing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1664793/+subscriptions



[Yahoo-eng-team] [Bug 1838522] [NEW] Disabled quotas break floating IPs

2019-07-31 Thread Marek Lyčka
Public bug reported:

Setting OPENSTACK_NEUTRON_NETWORK['enable_quotas'] to False breaks the
floating IP allocation dialog in Project/Instances.

To reproduce:

- Make sure OPENSTACK_NEUTRON_NETWORK['enable_quotas'] is set to false
(settings.py, local_settings.py); this is currently the default setting
in master.

- Make sure an instance is running

- Navigate to Project/Instances

- Open the "Associate Floating IP" action for an instance

- Click the + button next to the list of available floating ips

- An IP allocation modal should come up

Two issues manifest:

- The quota display in the bottom right of the dialog is broken: it does
not display numbers or the graphical representation of quotas

- Attempting to allocate a new IP by clicking the 'Allocate IP' button
fails with a (non-)exception in Horizon's log; it simply states
'available'. This is caused by an unhandled keyerror
/openstack_dashboard/dashboards/project/floating_ips/forms.py:59. With
disabled quotas, the floatingip dictionary doesn't contain 'available'.

Reproduced on Stein devstack and Horizon master.
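
A minimal sketch of the kind of guard that would avoid the KeyError
(hypothetical helper; the final fix may differ): treat a missing 'available'
entry as unlimited when quotas are disabled.

    def remaining_floating_ips(floating_pool):
        """floating_pool is the quota dict built by the allocation form; with
        enable_quotas set to False it has no 'available' key."""
        available = floating_pool.get('available')
        if available is None:
            return float('inf')  # quotas disabled: treat as unlimited
        return available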

** Affects: horizon
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Dashboard (Horizon).
https://bugs.launchpad.net/bugs/1838522

Title:
  Disabled quotas break floating IPs

Status in OpenStack Dashboard (Horizon):
  New

Bug description:
  Setting OPENSTACK_NEUTRON_NETWORK['enable_quotas'] to False breaks the
  floating IP allocation dialog in Project/Instances.

  To reproduce:

  - Make sure OPENSTACK_NEUTRON_NETWORK['enable_quotas'] is set to false
  (settings.py, local_settings.py); this is currently the default
  setting in master.

  - Make sure an instance is running

  - Navigate to Project/Instances

  - Open the "Associate Floating IP" action for an instance

  - Click the + button next to the list of available floating ips

  - An IP allocation modal should come up

  Two issues manifest:

  - The quota display in the bottom right of the dialog is broken: it
  does not display numbers or the graphical representation of quotas

  - Attempting to allocate a new IP by clicking the 'Allocate IP' button
  fails with an uninformative (non-)exception in Horizon's log; it simply
  states 'available'. This is caused by an unhandled KeyError at
  /openstack_dashboard/dashboards/project/floating_ips/forms.py:59. With
  disabled quotas, the floating IP dictionary doesn't contain the
  'available' key.

  Reproduced on Stein devstack and Horizon master.

To manage notifications about this bug go to:
https://bugs.launchpad.net/horizon/+bug/1838522/+subscriptions



[Yahoo-eng-team] [Bug 1764556] Re: "nova list" fails with exception.ServiceNotFound if service is deleted and has no UUID

2019-07-31 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/582408
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=16e163053ca39886f11fdb8a3af10a28619fc105
Submitter: Zuul
Branch:master

commit 16e163053ca39886f11fdb8a3af10a28619fc105
Author: melanie witt 
Date:   Thu Jul 12 21:48:23 2018 +

Don't generate service UUID for deleted services

In Pike, we added a UUID field to services and during an upgrade from
Ocata => Pike, when instances are accessed joined with their associated
services, we generate a UUID for the services on-the-fly.

This causes a problem in the scenario where an operator upgrades their
cluster and has old, deleted services with hostnames matching existing
services associated with instances. When we go to generate the service
UUID for the old, deleted service, we hit a ServiceTooOld exception.

This addresses the problem by not bothering to generate a UUID for a
deleted service. One alternative would be to exclude deleted services
when we join the 'instances' and 'services' tables, but I'm not sure
whether that approach might cause unintended effects where service
information that used to be viewable for instances becomes hidden.

Closes-Bug: #1778305
Closes-Bug: #1764556

Change-Id: I347096a527c257075cefe7b81210622f6cd87daf
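
A hedged sketch of the behaviour described above (simplified, not the exact
nova diff): skip the on-the-fly UUID backfill when the joined service row is
soft-deleted.

    from oslo_utils import uuidutils

    def backfill_service_uuid(service, db_service):
        """Generate a UUID for a legacy service record, except when deleted."""
        if db_service.get('uuid'):
            return  # already migrated, nothing to do
        if db_service.get('deleted'):
            # Old, deleted services may be older than the minimum supported
            # version; saving them would raise ServiceTooOld, so leave them.
            return
        service.uuid = uuidutils.generate_uuid()
        service.save()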


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1764556

Title:
  "nova list" fails with exception.ServiceNotFound if service is deleted
  and has no UUID

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) pike series:
  Confirmed
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  Confirmed
Status in OpenStack Compute (nova) stein series:
  Confirmed

Bug description:
  We had a testcase where we booted an instance on Newton, migrated it
  off the compute node, deleted the compute node (and service), upgraded
  to Pike, created a new compute node with the same name, and migrated
  the instance back to the compute node.

  At this point the "nova list" command failed with
  exception.ServiceNotFound.

  It appears that since the Service has no UUID the _from_db_object()
  routine will try to add it, but the service.save() call fails because
  the service in question has been deleted.

  I reproduced the issue with stable/pike devstack.  I booted an
  instance, then created a fake entry in the "services" table without a
  UUID so the table looked like this:

  mysql>  select * from services;
  
+-+-+-++--++---+--+--+-+-+-+-+-+--+
  | created_at  | updated_at  | deleted_at  | id | host 
| binary | topic | report_count | disabled | deleted | 
disabled_reason | last_seen_up| forced_down | version | uuid
 |
  
+-+-+-++--++---+--+--+-+-+-+-+-+--+
  | 2018-02-20 16:10:07 | 2018-04-16 22:10:46 | NULL|  1 | 
devstack | nova-conductor | conductor |   477364 |0 |   0 | 
NULL| 2018-04-16 22:10:46 |   0 |  22 | 
c041d7cf-5047-4014-b50c-3ba6b5d95097 |
  | 2018-02-20 16:10:10 | 2018-04-16 22:10:54 | NULL|  2 | 
devstack | nova-compute   | compute   |   477149 |0 |   0 | 
NULL| 2018-04-16 22:10:54 |   0 |  22 | 
d0cfb63c-8b59-4b65-bb7e-6b89acd3fe35 |
  | 2018-02-20 16:10:10 | 2018-04-16 20:29:33 | 2018-04-16 20:30:33 |  3 | 
devstack | nova-compute   | compute   |   476432 |0 |   3 | 
NULL| 2018-04-16 20:30:33 |   0 |  22 | NULL
 |
  
+-+-+-++--++---+--+--+-+-+-+-+-+--+


  At this point, running "nova show " worked fine, but running
  "nova list" failed:

  stack@devstack:~/devstack$ nova list
  ERROR (ClientException): Unexpected API Error. Please report this at 
http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
   (HTTP 500) (Request-ID: 
req-b7e1b5f9-e7b4-4ccf-ba28-e8b3e1acd2f6)

  
  The nova-api log looked like this:

  Apr 16 22:11:00 devstack devstack@n-api.service[4258]: DEBUG 

[Yahoo-eng-team] [Bug 1778305] Re: Nova may erronously look up service version of a deleted service, when hostname have been reused

2019-07-31 Thread OpenStack Infra
Reviewed:  https://review.opendev.org/582408
Committed: 
https://git.openstack.org/cgit/openstack/nova/commit/?id=16e163053ca39886f11fdb8a3af10a28619fc105
Submitter: Zuul
Branch:master

commit 16e163053ca39886f11fdb8a3af10a28619fc105
Author: melanie witt 
Date:   Thu Jul 12 21:48:23 2018 +

Don't generate service UUID for deleted services

In Pike, we added a UUID field to services and during an upgrade from
Ocata => Pike, when instances are accessed joined with their associated
services, we generate a UUID for the services on-the-fly.

This causes a problem in the scenario where an operator upgrades their
cluster and has old, deleted services with hostnames matching existing
services associated with instances. When we go to generate the service
UUID for the old, deleted service, we hit a ServiceTooOld exception.

This addresses the problem by not bothering to generate a UUID for a
deleted service. One alternative would be to exclude deleted services
when we join the 'instances' and 'services' tables, but I'm not sure
whether that approach might cause unintended effects where service
information that used to be viewable for instances becomes hidden.

Closes-Bug: #1778305
Closes-Bug: #1764556

Change-Id: I347096a527c257075cefe7b81210622f6cd87daf


** Changed in: nova
   Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1778305

Title:
  Nova may erronously look up service version of a deleted service, when
  hostname have been reused

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) pike series:
  Confirmed
Status in OpenStack Compute (nova) queens series:
  Confirmed
Status in OpenStack Compute (nova) rocky series:
  Confirmed
Status in OpenStack Compute (nova) stein series:
  Confirmed

Bug description:
  Prerequisites:

  - A compute node running an old version of nova has been deleted. (In our 
case, version 9)
  - The hostname of said compute node has been reused, and has been upgraded as 
per normal. (To version 16)
  - The services table in the nova database contains both the old and the new
node defined, where the deleted one is clearly marked as deleted and has
the old version specified in the version column. The new node also exists,
upgraded as it is.
  - One has at least one instance running on the upgraded node.
  - Perform upgrade from ocata to pike
  - Any projects with instances running on the upgraded node may erroneously
get an error message that "ERROR (BadRequest): This service is older (v9) than
the minimum (v16) version of the rest of the deployment. Unable to continue.
(HTTP 400) (Request-ID: req-3e0ababe-e09b-4ef8-ba3a-43060bc1f807)" --- when
performing 'nova list'.

  
  Example of how this may look in the database:

  MariaDB [nova]> SELECT * FROM services WHERE host = 'node11.acme.org';
  
+-+-+-+-+-+--+-+--+--+-+-+-+-+-+--+
  | created_at  | updated_at  | deleted_at  | id  | 
host| binary   | topic   | report_count | disabled | deleted | 
disabled_reason | last_seen_up| forced_down | version | uuid
 |
  
+-+-+-+-+-+--+-+--+--+-+-+-+-+-+--+
  | 2017-10-17 13:06:10 | 2018-06-22 21:42:42 | NULL| 179 | 
node11.acme.org | nova-compute | compute |  2138069 |0 |   0 | 
NULL| 2018-06-22 21:42:42 |   0 |  22 | 
63e1cb55-ee00-4cb8-b304-160dd5c45fdd |
  | 2016-08-13 08:20:05 | 2016-11-15 00:01:21 | 2016-11-27 15:11:30 | 104 | 
node11.acme.org | nova-compute | compute |   796220 |1 | 104 | 
NULL| 2016-11-15 00:01:21 |   0 |   9 | NULL
 |
  
+-+-+-+-+-+--+-+--+--+-+-+-+-+-+--+
  2 rows in set (0.01 sec)


  Removing the old service from the database is an effective workaround
  for this problem.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1778305/+subscriptions
