[Openstack-operators] [nova] nova-compute automatically disabling itself?

Chris Apsey Wed, 31 Jan 2018 13:21:07 -0800

All,

Running into a strange issue I haven't seen before.

Randomly, the nova-compute services on compute nodes are disabling themselves (as if someone ran openstack compute service set --disable hostX nova-compute). When this happens, the node continues to report itself as 'up' - the service is just disabled. As a result, if enough of these occur, we get scheduling errors due to lack of available resources (which makes sense). Re-enabling them works just fine and they continue on as if nothing happened. I looked through the logs and I can find the API calls where we re-enable the services (PUT /v2.1/os-services/enable), but I do not see any API calls where the services are getting disabled initially.
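
For anyone wanting to spot this state quickly, here is a rough sketch (not something from our deployment) that lists nova-compute services which are 'disabled' while still 'up', along with whatever disabled reason is recorded. It assumes the openstack CLI is installed with credentials in the environment, and that the JSON keys match the column headers of `openstack compute service list --long` ("Binary", "Host", "Status", "State", "Disabled Reason"); the function names are made up for illustration.

```python
import json
import subprocess


def disabled_compute_services(records):
    """Return (host, reason) pairs for nova-compute services that are
    disabled but still reporting as up."""
    hits = []
    for rec in records:
        if rec.get("Binary") != "nova-compute":
            continue
        if rec.get("Status") == "disabled" and rec.get("State") == "up":
            # Key name assumed from the --long column header.
            hits.append((rec.get("Host"), rec.get("Disabled Reason") or ""))
    return hits


def fetch_service_records():
    """Ask the openstack CLI for the service list as JSON.
    Assumes the CLI is on PATH and OS_* credentials are set."""
    result = subprocess.run(
        ["openstack", "compute", "service", "list",
         "--service", "nova-compute", "--long", "-f", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    for host, reason in disabled_compute_services(fetch_service_records()):
        print(f"{host}: {reason or '(no reason recorded)'}")
```

If a reason is recorded for the affected hosts, it may hint at what disabled the service, since no disable call shows up in the API logs.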

Is anyone aware of any cases where compute nodes will automatically disable their nova-compute service on their own, or has anyone seen this before and might know a root cause? We have plenty of spare vcpus and RAM on each node - like less than 25% utilization (both in absolute terms and in terms of applied ratios).

We're seeing follow-on errors regarding rmq messages getting lost and vif-plug failures, but we think those are a symptom, not a cause.


Currently running pike on Xenial.

---
v/r

Chris Apsey
bitskr...@bitskrieg.net
https://www.bitskrieg.net

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
