Public bug reported:

Description
===========

Bringing up a new cluster as part of our CI after switching from 16.1.0 to
16.1.1 on CentOS, I'm seeing this error on some computes:

2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager Traceback (most recent call last):
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6752, in update_available_resource_for_node
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 704, in update_available_resource
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     self._update_available_resource(context, resources)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     return f(*args, **kwargs)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 728, in _update_available_resource
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     self._init_compute_node(context, resources)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 585, in _init_compute_node
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     self._update(context, cn)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 886, in _update
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     inv_data,
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 64, in set_inventory_for_provider
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     inv_data,
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     return getattr(self.instance, __name)(*args, **kwargs)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 789, in set_inventory_for_provider
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     self._update_inventory(rp_uuid, inv_data)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 56, in wrapper
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     return f(self, *a, **k)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 675, in _update_inventory
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     if self._update_inventory_attempt(rp_uuid, inv_data):
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 562, in _update_inventory_attempt
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     curr = self._get_inventory_and_update_provider_generation(rp_uuid)
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 546, in _get_inventory_and_update_provider_generation
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager     if server_gen != my_rp['generation']:
2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager TypeError: 'NoneType' object has no attribute '__getitem__'

The error seems to be persistent for the lifetime of a single nova-compute
run: once a process hits it, it keeps recurring.
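
For what it's worth, the TypeError itself is just Python 2 telling us that
my_rp is None at the point where report.py indexes it. A minimal sketch of
the failure mode as I read the traceback (server_gen and my_rp are the names
from the trace; the values and comments are my reconstruction, not the
actual nova code):

    # Reconstruction of the failing comparison in
    # nova/scheduler/client/report.py line 546 (16.1.1).
    my_rp = None    # locally cached resource provider record is missing
    server_gen = 1  # generation reported back by the placement API

    if server_gen != my_rp['generation']:
        pass
    # Python 2: TypeError: 'NoneType' object has no attribute '__getitem__'

So it looks as though the compute's local cache of its resource provider was
never populated (or was cleared), and nothing afterwards recovers from that;
I haven't confirmed why.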

Steps to reproduce
==================

Nodes are started by our CI infrastructure: 3 computes and a single control
node.  In roughly 50% of runs, one of the computes comes up in this bad
state.

Expected result
===============

Working cluster.

Actual result
=============

At least one of the 3 compute nodes fails to join the cluster: it is not
picked up by discover_hosts, and the above stack trace is repeated in the
nova-compute logs.
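
For context, "not picked up by discover_hosts" means that the standard
discovery and hypervisor listing on the control node never show the bad
compute (these are the stock commands, nothing CI-specific):

$ nova-manage cell_v2 discover_hosts --verbose
$ openstack hypervisor list

The healthy computes appear in both; the one hitting the traceback above
never does.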

Environment
===========
1. Exact version of OpenStack you are running. See the following
  list for all releases: http://docs.openstack.org/releases/

$ rpm -qa | grep nova
python-nova-16.1.1-1.el7.noarch
openstack-nova-common-16.1.1-1.el7.noarch
python2-novaclient-9.1.1-1.el7.noarch
openstack-nova-api-16.1.1-1.el7.noarch
openstack-nova-compute-16.1.1-1.el7.noarch


2. Which hypervisor did you use?
   (For example: Libvirt + KVM, Libvirt + XEN, Hyper-V, PowerKVM, ...)
   What's the version of that?

$ rpm -qa | grep kvm
libvirt-daemon-kvm-3.2.0-14.el7_4.9.x86_64
qemu-kvm-common-ev-2.9.0-16.el7_4.14.1.x86_64
qemu-kvm-ev-2.9.0-16.el7_4.14.1.x86_64

3. Which storage type did you use?
   (For example: Ceph, LVM, GPFS, ...)
   What's the version of that?

Not sure

4. Which networking type did you use?
   (For example: nova-network, Neutron with OpenVSwitch, ...)

Neutron with Calico (I work on Calico; this is our CI system)

** Affects: nova
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1767139

Title:
  TypeError in _get_inventory_and_update_provider_generation

Status in OpenStack Compute (nova):
  New

