I know what the problem is:

(9:59:34 AM) mriedem: set_inventory_for_provider -> _ensure_resource_provider -> _create_resource_provider -> safe_connect returns None because it can't talk to placement yet
(9:59:41 AM) mriedem: https://review.openstack.org/#/c/524618/2/nova/scheduler/client/report.py@516
(9:59:44 AM) mriedem: so we put None in the cache
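The chain in the IRC log above can be boiled down to a minimal, self-contained sketch. This is not nova's actual code — class and method names below (ReportClientSketch, update_generation) are hypothetical — but it reproduces the mechanism: a @safe_connect-style decorator swallows the connection failure and returns None, the caller caches that None as if it were a real provider record, and the later `my_rp['generation']` lookup raises TypeError.

```python
def safe_connect(f):
    """Sketch of a @safe_connect-style decorator: on connection
    failure it returns None instead of raising."""
    def wrapper(self, *args, **kwargs):
        try:
            return f(self, *args, **kwargs)
        except ConnectionError:
            # Placement is unreachable; callers silently get None.
            return None
    return wrapper


class ReportClientSketch:
    """Hypothetical stand-in for the scheduler report client."""

    def __init__(self, placement_up=False):
        self.placement_up = placement_up
        self._provider_cache = {}  # rp_uuid -> provider dict (or None!)

    @safe_connect
    def _create_resource_provider(self, rp_uuid, name):
        if not self.placement_up:
            raise ConnectionError('placement not available yet')
        return {'uuid': rp_uuid, 'name': name, 'generation': 0}

    def _ensure_resource_provider(self, rp_uuid, name):
        if rp_uuid not in self._provider_cache:
            # Bug: the None returned via @safe_connect is cached as if
            # it were a real provider record.
            self._provider_cache[rp_uuid] = self._create_resource_provider(
                rp_uuid, name)
        return self._provider_cache[rp_uuid]

    def update_generation(self, rp_uuid, server_gen):
        my_rp = self._ensure_resource_provider(rp_uuid, 'node1')
        # Raises TypeError when my_rp is the cached None, even if
        # placement has since come back up.
        return server_gen != my_rp['generation']
```

Note that once the None is cached, bringing placement back up does not help: the poisoned cache entry keeps the TypeError persistent for the life of the process, matching the "error seems persistent for a single run of nova-compute" observation below.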
** Changed in: nova
       Status: New => Triaged

** Also affects: nova/pike
   Importance: Undecided
       Status: New

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Changed in: nova
     Assignee: (unassigned) => Matt Riedemann (mriedem)

** Changed in: nova
   Importance: Undecided => High

** Changed in: nova/queens
       Status: New => Triaged

** Changed in: nova/pike
   Importance: Undecided => Medium

** Changed in: nova/queens
   Importance: Undecided => Medium

** Changed in: nova/pike
       Status: New => Triaged

** Changed in: nova
   Importance: High => Medium

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1767139

Title:
  TypeError in _get_inventory_and_update_provider_generation

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) pike series:
  Triaged
Status in OpenStack Compute (nova) queens series:
  Triaged

Bug description:
  Description
  ===========
  While bringing up a new cluster as part of our CI after switching from
  16.1.0 to 16.1.1 on CentOS, I'm seeing this error on some computes:

  2018-04-26 13:36:26.580 14536 ERROR nova.compute.manager Traceback (most recent call last):
    File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6752, in update_available_resource_for_node
      rt.update_available_resource(context, nodename)
    File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 704, in update_available_resource
      self._update_available_resource(context, resources)
    File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
      return f(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 728, in _update_available_resource
      self._init_compute_node(context, resources)
    File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 585, in _init_compute_node
      self._update(context, cn)
    File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 886, in _update
      inv_data,
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 64, in set_inventory_for_provider
      inv_data,
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
      return getattr(self.instance, __name)(*args, **kwargs)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 789, in set_inventory_for_provider
      self._update_inventory(rp_uuid, inv_data)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 56, in wrapper
      return f(self, *a, **k)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 675, in _update_inventory
      if self._update_inventory_attempt(rp_uuid, inv_data):
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 562, in _update_inventory_attempt
      curr = self._get_inventory_and_update_provider_generation(rp_uuid)
    File "/usr/lib/python2.7/site-packages/nova/scheduler/client/report.py", line 546, in _get_inventory_and_update_provider_generation
      if server_gen != my_rp['generation']:
  TypeError: 'NoneType' object has no attribute '__getitem__'

  The error persists for the lifetime of a single nova-compute run.

  Steps to reproduce
  ==================
  Nodes were started by our CI infrastructure. We start three computes
  and a single control node. In 50% of cases, one of the computes comes
  up in this bad state.

  Expected result
  ===============
  Working cluster.

  Actual result
  =============
  At least one of the three nodes fails to join the cluster: it is not
  picked up by discover_hosts, and the above stack trace is repeated in
  the nova-compute logs.

  Environment
  ===========
  1. Exact version of OpenStack you are running (see
     http://docs.openstack.org/releases/ for all releases):
     $ rpm -qa | grep nova
     python-nova-16.1.1-1.el7.noarch
     openstack-nova-common-16.1.1-1.el7.noarch
     python2-novaclient-9.1.1-1.el7.noarch
     openstack-nova-api-16.1.1-1.el7.noarch
     openstack-nova-compute-16.1.1-1.el7.noarch
  2. Which hypervisor did you use, and what version? Libvirt + KVM:
     $ rpm -qa | grep kvm
     libvirt-daemon-kvm-3.2.0-14.el7_4.9.x86_64
     qemu-kvm-common-ev-2.9.0-16.el7_4.14.1.x86_64
     qemu-kvm-ev-2.9.0-16.el7_4.14.1.x86_64
  3. Which storage type did you use (Ceph, LVM, GPFS, ...), and what
     version? Not sure.
  4. Which networking type did you use (nova-network, Neutron with
     OpenVSwitch, ...)? Neutron with Calico (I work on Calico; this is
     our CI system).

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1767139/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
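Given the diagnosis (None cached as a provider record), the defensive shape of a fix is straightforward. The sketch below is hypothetical — function names ensure_resource_provider and generation_changed are illustrative, not nova's actual patch — but it shows the two guards that close this class of bug: never cache a None provider, and bail out cleanly instead of subscripting None so the periodic task can retry on its next run.

```python
def ensure_resource_provider(cache, rp_uuid, create_fn):
    """Return the cached provider record, creating it on demand.

    Unlike the buggy path, a None from create_fn (placement down)
    is never written into the cache.
    """
    rp = cache.get(rp_uuid)
    if rp is None:
        rp = create_fn(rp_uuid)
        if rp is not None:  # only cache real provider records
            cache[rp_uuid] = rp
    return rp


def generation_changed(cache, rp_uuid, server_gen, create_fn):
    """Compare generations, tolerating an unreachable placement."""
    rp = ensure_resource_provider(cache, rp_uuid, create_fn)
    if rp is None:
        # Placement still unreachable: report "no change" rather than
        # raising TypeError; the next periodic run retries from scratch.
        return False
    return server_gen != rp['generation']
```

With this shape, a compute that starts before placement is reachable simply retries on the next update_available_resource pass instead of wedging itself with a poisoned cache entry.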