** Also affects: nova/ocata
   Importance: Undecided
       Status: New

** Also affects: nova/rocky
   Importance: Undecided
       Status: New

** Also affects: nova/stein
   Importance: Undecided
       Status: New

** Also affects: nova/pike
   Importance: Undecided
       Status: New

** Also affects: nova/queens
   Importance: Undecided
       Status: New

** Changed in: nova/ocata
       Status: New => Triaged

** Changed in: nova/pike
       Status: New => Triaged

** Changed in: nova/queens
       Status: New => Triaged

** Changed in: nova/stein
       Status: New => Triaged

** Changed in: nova/pike
   Importance: Undecided => Medium

** Changed in: nova/rocky
   Importance: Undecided => Medium

** Changed in: nova/queens
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1839674

Title:
  ResourceTracker.compute_nodes won't try to create a ComputeNode a
  second time if the first create() fails

Status in OpenStack Compute (nova):
  Triaged
Status in OpenStack Compute (nova) ocata series:
  Triaged
Status in OpenStack Compute (nova) pike series:
  Triaged
Status in OpenStack Compute (nova) queens series:
  Triaged
Status in OpenStack Compute (nova) rocky series:
  New
Status in OpenStack Compute (nova) stein series:
  Triaged

Bug description:
  I found this while writing a functional recreate test for bug
  1839560.

  As of this change in Ocata:

  https://github.com/openstack/nova/commit/1c967593fbb0ab8b9dc8b0b509e388591d32f537

  The ResourceTracker.compute_nodes dict will store the ComputeNode
  object *before* trying to create it:

  https://github.com/openstack/nova/blob/6b7d0caad86fe32ffc49a8672de1eb7258f3b919/nova/compute/resource_tracker.py#L570-L571

  The problem is if ComputeNode.create() fails for whatever reason, the
  next run through update_available_resource won't try to create the
  ComputeNode again because of this:

  https://github.com/openstack/nova/blob/6b7d0caad86fe32ffc49a8672de1eb7258f3b919/nova/compute/resource_tracker.py#L546

  And eventually you get errors like this:

  b'2019-08-09 17:02:59,356 ERROR [nova.compute.manager] Error updating resources for node node2.'
  b'Traceback (most recent call last):'
  b' File "/home/osboxes/git/nova/nova/compute/manager.py", line 8250, in _update_available_resource_for_node'
  b' startup=startup)'
  b' File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 715, in update_available_resource'
  b' self._update_available_resource(context, resources, startup=startup)'
  b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_concurrency/lockutils.py", line 328, in inner'
  b' return f(*args, **kwargs)'
  b' File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 796, in _update_available_resource'
  b' self._update(context, cn, startup=startup)'
  b' File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 1052, in _update'
  b' self.old_resources[nodename] = old_compute'
  b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__'
  b' self.force_reraise()'
  b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise'
  b' six.reraise(self.type_, self.value, self.tb)'
  b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/six.py", line 693, in reraise'
  b' raise value'
  b' File "/home/osboxes/git/nova/nova/compute/resource_tracker.py", line 1046, in _update'
  b' compute_node.save()'
  b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 226, in wrapper'
  b' return fn(self, *args, **kwargs)'
  b' File "/home/osboxes/git/nova/nova/objects/compute_node.py", line 352, in save'
  b' db_compute = db.compute_node_update(self._context, self.id, updates)'
  b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 67, in getter'
  b' self.obj_load_attr(name)'
  b' File "/home/osboxes/git/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_versionedobjects/base.py", line 603, in obj_load_attr'
  b' _("Cannot load \'%s\' in the base class") % attrname)'
  b"NotImplementedError: Cannot load 'id' in the base class"

  We should only map the ComputeNode when we've successfully created
  it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1839674/+subscriptions
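
For illustration only, here is a rough standalone sketch of the ordering problem described in this bug and of the suggested fix. It does not use nova code: FakeDB, FakeComputeNode, BuggyTracker, FixedTracker and run_twice are hypothetical stand-ins, and the simulated create() failure is an assumption made to reproduce the sequence of errors. The only behaviour taken from the report is that the node is cached before create() and that a later save() fails with NotImplementedError because 'id' was never set.

```python
class FakeDB:
    """Hypothetical backend whose first `failures` create calls fail."""

    def __init__(self, failures):
        self.failures_left = failures
        self.next_id = 1


class FakeComputeNode:
    """Hypothetical stand-in for a ComputeNode object (not nova's class)."""

    def __init__(self, db, nodename):
        self.db = db
        self.nodename = nodename
        self.id = None  # only set once create() succeeds

    def create(self):
        if self.db.failures_left > 0:
            self.db.failures_left -= 1
            raise RuntimeError("simulated create() failure")
        self.id = self.db.next_id
        self.db.next_id += 1

    def save(self):
        # Mirrors the symptom in the traceback: saving a node that was
        # never created has no 'id' to update.
        if self.id is None:
            raise NotImplementedError("Cannot load 'id' in the base class")


class BuggyTracker:
    """Caches the node *before* create(), as described in the report."""

    def __init__(self, db):
        self.db = db
        self.compute_nodes = {}

    def update_available_resource(self, nodename):
        if nodename not in self.compute_nodes:
            cn = FakeComputeNode(self.db, nodename)
            self.compute_nodes[nodename] = cn  # cached too early
            cn.create()
        self.compute_nodes[nodename].save()


class FixedTracker(BuggyTracker):
    """Caches the node only after create() has succeeded."""

    def update_available_resource(self, nodename):
        if nodename not in self.compute_nodes:
            cn = FakeComputeNode(self.db, nodename)
            cn.create()  # if this raises, nothing is cached
            self.compute_nodes[nodename] = cn
        self.compute_nodes[nodename].save()


def run_twice(tracker_cls):
    """Run two periodic updates where the first create() fails."""
    tracker = tracker_cls(FakeDB(failures=1))
    outcomes = []
    for _ in range(2):
        try:
            tracker.update_available_resource("node2")
            outcomes.append("ok")
        except Exception as exc:
            outcomes.append(type(exc).__name__)
    return outcomes


if __name__ == "__main__":
    print("buggy:", run_twice(BuggyTracker))
    print("fixed:", run_twice(FixedTracker))
```

In this sketch the buggy tracker never retries create() on the second pass and fails in save() instead, while the fixed tracker retries and recovers. The actual fix in nova may differ in detail.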