Public bug reported:

When the nova power sync pool is exhausted, the compute service goes down. As a result, scale and performance tests fail.
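For context, the power-state sync periodic task hands each instance off to a fixed-size greenthread pool; once every slot is busy, spawning the next worker blocks the caller, so the periodic task (and anything waiting on the locks it holds, such as "compute_resources" below) stalls. A purely illustrative stdlib sketch of that failure mode follows — it uses a semaphore-bounded thread pool rather than eventlet, and names like `BoundedPool` are invented for this example, not nova code:

```python
import threading
import time

class BoundedPool:
    """Toy analogue of a fixed-size worker pool: spawn() blocks the
    caller while all slots are busy, which is how a too-small pool can
    stall the task that is dispatching work into it."""

    def __init__(self, size):
        self._slots = threading.Semaphore(size)

    def spawn(self, fn, *args):
        self._slots.acquire()  # blocks here when the pool is exhausted

        def run():
            try:
                fn(*args)
            finally:
                self._slots.release()  # free the slot when work is done

        t = threading.Thread(target=run)
        t.start()
        return t

pool = BoundedPool(2)  # deliberately tiny pool
start = time.monotonic()
# 4 tasks of 0.2s each into 2 slots: spawning the 3rd task blocks the
# dispatcher until a slot frees, so the batch takes roughly two rounds
# (~0.4s) instead of one (~0.2s).
threads = [pool.spawn(time.sleep, 0.2) for _ in range(4)]
for t in threads:
    t.join()
elapsed = time.monotonic() - start
```

With enough slow tasks (e.g. power-state RPCs that each wait out a messaging timeout), the dispatcher can be blocked long enough for the service to be reported down, which matches the 6004-second lock hold time in the logs.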
2018-06-12 19:58:48.871 30126 WARNING oslo.messaging._drivers.impl_rabbit [req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 d37803befc35418981f1f0b6dceec696 - default default] Unexpected error during heartbeart thread processing, retrying...: error: [Errno 104] Connection reset by peer
2018-06-12 19:58:48.872 30126 WARNING oslo.messaging._drivers.impl_rabbit [req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 d37803befc35418981f1f0b6dceec696 - default default] Unexpected error during heartbeart thread processing, retrying...: error: [Errno 104] Connection reset by peer
2018-06-12 19:58:54.793 30126 WARNING oslo.messaging._drivers.impl_rabbit [req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 d37803befc35418981f1f0b6dceec696 - default default] Unexpected error during heartbeart thread processing, retrying...: error: [Errno 104] Connection reset by peer
2018-06-12 21:37:23.805 30126 DEBUG oslo_concurrency.lockutils [req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 d37803befc35418981f1f0b6dceec696 - default default] Lock "compute_resources" released by "nova.compute.resource_tracker._update_available_resource" :: held 6004.943s inner /usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:288
2018-06-12 21:37:23.807 30126 ERROR nova.compute.manager [req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 d37803befc35418981f1f0b6dceec696 - default default] Error updating resources for node domain-c7.fd3d2358-cc8d-4773-9fef-7a2713ac05ba.: MessagingTimeout: Timed out waiting for a reply to message ID 1eb4b1b40f0f4c66b0266608073717e8

root@controller01:/var/log/nova# vi nova-conductor.log.1

2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager [req-77b5e1d7-a4b7-468e-98af-dfdfbf2fad7f 1b5d8da24b39464cb6736d122ccc0665 eb361d7bc9bd40059a2ce2848c985772 - default default] Failed to schedule instances: NoValidHost_Remote: No valid host was found. There are not enough hosts available.
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 226, in inner
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 153, in select_destinations
    allocation_request_version, return_alternates)
  File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", line 93, in select_destinations
    allocation_request_version, return_alternates)
  File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", line 245, in _schedule
    claimed_instance_uuids)
  File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", line 282, in _ensure_sufficient_hosts
    raise exception.NoValidHost(reason=reason)
NoValidHost: No valid host was found. There are not enough hosts available.
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager Traceback (most recent call last):
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 1118, in schedule_and_build_instances
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     instance_uuids, return_alternates=True)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 718, in _schedule_instances
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     return_alternates=return_alternates)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/nova/scheduler/utils.py", line 727, in wrapped
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     return func(*args, **kwargs)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/__init__.py", line 53, in select_destinations
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     instance_uuids, return_objects, return_alternates)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     return getattr(self.instance, __name)(*args, **kwargs)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/query.py", line 42, in select_destinations
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     instance_uuids, return_objects, return_alternates)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/nova/scheduler/rpcapi.py", line 158, in select_destinations
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     return cctxt.call(ctxt, 'select_destinations', **msg_args)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 174, in call
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     retry=self.retry)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 131, in _send
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     timeout=timeout, retry=retry)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager   File "/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 559, in send
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager     retry=retry)

** Affects: nova
   Importance: Undecided
   Assignee: Gary Kotton (garyk)
       Status: In Progress

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1776621

Title:
  Scale: when periodic pool size is small and there is a lot of load the
  compute service goes down

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  When the nova power sync pool is exhausted the compute service will go
  down.
  This results in scale and performance tests failing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1776621/+subscriptions