Public bug reported:

When the nova power sync pool is exhausted, the compute service goes
down. This causes scale and performance tests to fail.
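The pool in question is the greenthread pool used by the `_sync_power_states` periodic task. One plausible mitigation (a sketch, not part of this report: the option name `sync_power_state_pool_size` is an assumption about what the in-progress fix might expose, not a released setting) is to make the pool size operator-tunable in nova.conf:

```ini
# /etc/nova/nova.conf -- hypothetical mitigation sketch; the option name
# below is an assumption, not a documented nova setting at the time of
# this report
[DEFAULT]
# Size of the greenthread pool used by the power-state sync periodic
# task; a larger pool avoids blocking the compute service under load.
sync_power_state_pool_size = 1000
```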

2018-06-12 19:58:48.871 30126 WARNING oslo.messaging._drivers.impl_rabbit 
[req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 
d37803befc35418981f1f0b6dceec696 - default default] Unexpected error during 
heartbeart thread processing, retrying...: error: [Errno 104] Connection reset 
by peer
2018-06-12 19:58:48.872 30126 WARNING oslo.messaging._drivers.impl_rabbit 
[req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 
d37803befc35418981f1f0b6dceec696 - default default] Unexpected error during 
heartbeart thread processing, retrying...: error: [Errno 104] Connection reset 
by peer
2018-06-12 19:58:54.793 30126 WARNING oslo.messaging._drivers.impl_rabbit 
[req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 
d37803befc35418981f1f0b6dceec696 - default default] Unexpected error during 
heartbeart thread processing, retrying...: error: [Errno 104] Connection reset 
by peer
2018-06-12 21:37:23.805 30126 DEBUG oslo_concurrency.lockutils 
[req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 
d37803befc35418981f1f0b6dceec696 - default default] Lock "compute_resources" 
released by "nova.compute.resource_tracker._update_available_resource" :: held 
6004.943s inner 
/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:288
2018-06-12 21:37:23.807 30126 ERROR nova.compute.manager 
[req-196321bb-a11a-4e6e-a80a-544ecd093986 c3de6d9ec02c494d978330d8f1a64da1 
d37803befc35418981f1f0b6dceec696 - default default] Error updating resources 
for node domain-c7.fd3d2358-cc8d-4773-9fef-7a2713ac05ba.: MessagingTimeout: 
Timed out waiting for a reply to message ID 1eb4b1b40f0f4c66b0266608073717e8

root@controller01:/var/log/nova# vi nova-conductor.log.1
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager 
[req-77b5e1d7-a4b7-468e-98af-dfdfbf2fad7f 1b5d8da24b39464cb6736d122ccc0665 
eb361d7bc9bd40059a2ce2848c985772 - default default] Failed to schedule 
instances: NoValidHost_Remote: No valid host was found. There are not enough 
hosts available.
Traceback (most recent call last):

  File "/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 
226, in inner
    return func(*args, **kwargs)

  File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 153, 
in select_destinations
    allocation_request_version, return_alternates)

  File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", 
line 93, in select_destinations
    allocation_request_version, return_alternates)

  File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", 
line 245, in _schedule
    claimed_instance_uuids)

  File "/usr/lib/python2.7/dist-packages/nova/scheduler/filter_scheduler.py", 
line 282, in _ensure_sufficient_hosts
    raise exception.NoValidHost(reason=reason)

NoValidHost: No valid host was found. There are not enough hosts available.
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager Traceback (most 
recent call last):
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 1118, in 
schedule_and_build_instances
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager instance_uuids, 
return_alternates=True)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 718, in 
_schedule_instances
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager 
return_alternates=return_alternates)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/nova/scheduler/utils.py", line 727, in wrapped
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager return func(*args, 
**kwargs)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/nova/scheduler/client/__init__.py", line 53, 
in select_destinations
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager instance_uuids, 
return_objects, return_alternates)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/nova/scheduler/client/__init__.py", line 37, 
in __run_method
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager return 
getattr(self.instance, __name)(*args, **kwargs)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/nova/scheduler/client/query.py", line 42, in 
select_destinations
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager instance_uuids, 
return_objects, return_alternates)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/nova/scheduler/rpcapi.py", line 158, in 
select_destinations
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager return 
cctxt.call(ctxt, 'select_destinations', **msg_args)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 174, in 
call
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager retry=self.retry)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 131, in 
_send
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager timeout=timeout, 
retry=retry)
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager File 
"/usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 
559, in send
2018-06-12 20:48:10.161 6328 ERROR nova.conductor.manager retry=retry)
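The 6004s "compute_resources" lock hold above is consistent with pool exhaustion: when a bounded pool has no free slot, spawning into it blocks the caller, so the periodic task stalls, still holding its lock, until a slot frees up. A minimal stand-in for this behaviour (plain threading, not nova code; `BoundedPool` is an illustrative class, not a real nova or eventlet API):

```python
import threading
import time

class BoundedPool:
    """Toy stand-in for a bounded green pool: spawn() blocks when full."""
    def __init__(self, size):
        self._slots = threading.Semaphore(size)

    def spawn(self, fn, *args):
        self._slots.acquire()  # blocks the caller when the pool is exhausted
        def run():
            try:
                fn(*args)
            finally:
                self._slots.release()
        t = threading.Thread(target=run)
        t.start()
        return t

pool = BoundedPool(size=1)  # tiny pool to force exhaustion

def slow_power_sync():
    time.sleep(0.5)  # e.g. a hung hypervisor call holding the slot

start = time.time()
pool.spawn(slow_power_sync)  # fills the only slot
pool.spawn(slow_power_sync)  # blocks until the first worker finishes
blocked_for = time.time() - start
print(round(blocked_for, 1))  # roughly the first worker's runtime
```

With a pool of 1000 and enough hung power-sync calls, the same blocking happens at scale: the periodic task stalls for as long as the slowest workers take, which matches the multi-hour lock hold in the log.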

** Affects: nova
     Importance: Undecided
     Assignee: Gary Kotton (garyk)
         Status: In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1776621

Title:
  Scale: when periodic pool size is small and there is a lot of load the
  compute service goes down

Status in OpenStack Compute (nova):
  In Progress

Bug description:
  When the nova power sync pool is exhausted the compute service will go
  down. This results in scale and performance tests failing.

  (Logs and traceback as above.)

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1776621/+subscriptions
