Hi,

We have a 5-node OpenStack RDO Newton setup with one controller node.
The compute service status is fluctuating on all nodes. Each service shows as up for a few seconds and then goes down again in nova service-list. Here is a snapshot:

[root@controller-internal ~]# nova service-list
+-----+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| Id  | Binary           | Host                | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+-----+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| 5   | nova-compute     | redhat-compute1     | compute1 | enabled | up    | 2017-11-20T10:43:35.000000 | -               |
| 12  | nova-conductor   | controller-internal | internal | enabled | up    | 2017-11-20T10:43:51.000000 | -               |
| 15  | nova-consoleauth | controller-internal | internal | enabled | up    | 2017-11-20T10:43:54.000000 | -               |
| 16  | nova-cert        | controller-internal | internal | enabled | up    | 2017-11-20T10:43:54.000000 | -               |
| 17  | nova-scheduler   | controller-internal | internal | enabled | up    | 2017-11-20T10:43:58.000000 | -               |
| 18  | nova-compute     | redhat-compute2     | compute2 | enabled | down  | 2017-11-20T10:40:47.000000 | -               |
| 19  | nova-compute     | redhat-compute3     | compute3 | enabled | down  | 2017-11-20T10:41:24.000000 | -               |
| 20  | nova-compute     | redhat-compute4     | compute4 | enabled | down  | 2017-11-19T17:11:26.000000 | -               |
| 21  | nova-compute     | redhat-compute5     | compute5 | enabled | down  | 2017-11-20T10:40:19.000000 | -               |
+-----+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+

In the RabbitMQ logs I can see that a few reply exchanges are reported as missing, and there is also a heartbeat_timeout error.
----------------------------------------------------------------------
=ERROR REPORT==== 20-Nov-2017::05:26:24 ===
connection <0.4177.0>, channel 1 - soft error:
{amqp_error,not_found,
            "no exchange 'reply_6cd8b245e6b84bc3a80cfc07da243b79' in vhost '/'",
            'exchange.declare'}

=ERROR REPORT==== 20-Nov-2017::05:26:25 ===
closing AMQP connection <0.2725.0> (10.0.0.2:59658 -> 10.0.0.2:5672):
{heartbeat_timeout,running}
----------------------------------------------------------------------

A queue with this name is present in RabbitMQ (see the list_queues output below), so I don't know why the exchange could not be found. Telnet to the RabbitMQ port works fine from the compute node, and no firewall is enabled.

[root@controller-internal ~]# rabbitmqctl list_queues |grep reply_6cd8b245e6b84bc3a80cfc07da243b79
reply_6cd8b245e6b84bc3a80cfc07da243b79 0
[root@controller-internal ~]#

Any suggestions or troubleshooting steps are welcome.

Thanks & Regards,
Shyam Biradar,
Email: shyambiradarsgg...@gmail.com,
Contact: +91 8600266938.
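
P.S. In case it helps anyone reproduce the check: as far as I understand, nova treats a service as down once its last report (Updated_at) is older than service_down_time, which I believe defaults to 60 seconds. Below is a rough Python sketch that applies that rule to the nova service-list output above. The 60 second threshold, the use of the newest timestamp in the table as "now", and the helper names parse_rows and stale_services are my own assumptions for illustration, not anything taken from nova itself.

```python
from datetime import datetime, timedelta

# Assumption: nova's service_down_time is left at what I believe is its default (60 s).
SERVICE_DOWN_TIME = timedelta(seconds=60)

# Data rows copied from the `nova service-list` output in this mail.
ROWS = """\
| 5 | nova-compute | redhat-compute1 | compute1 | enabled | up | 2017-11-20T10:43:35.000000 | - |
| 12 | nova-conductor | controller-internal | internal | enabled | up | 2017-11-20T10:43:51.000000 | - |
| 15 | nova-consoleauth | controller-internal | internal | enabled | up | 2017-11-20T10:43:54.000000 | - |
| 16 | nova-cert | controller-internal | internal | enabled | up | 2017-11-20T10:43:54.000000 | - |
| 17 | nova-scheduler | controller-internal | internal | enabled | up | 2017-11-20T10:43:58.000000 | - |
| 18 | nova-compute | redhat-compute2 | compute2 | enabled | down | 2017-11-20T10:40:47.000000 | - |
| 19 | nova-compute | redhat-compute3 | compute3 | enabled | down | 2017-11-20T10:41:24.000000 | - |
| 20 | nova-compute | redhat-compute4 | compute4 | enabled | down | 2017-11-19T17:11:26.000000 | - |
| 21 | nova-compute | redhat-compute5 | compute5 | enabled | down | 2017-11-20T10:40:19.000000 | - |
"""


def parse_rows(text):
    """Parse table data rows; header and border lines are skipped."""
    services = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # A data row has 8 cells and a numeric Id in the first one.
        if len(cells) != 8 or not cells[0].isdigit():
            continue
        services.append({
            "binary": cells[1],
            "host": cells[2],
            "state": cells[5],
            "updated_at": datetime.strptime(cells[6], "%Y-%m-%dT%H:%M:%S.%f"),
        })
    return services


def stale_services(services, now, threshold=SERVICE_DOWN_TIME):
    """Return services whose last report is older than the threshold."""
    return [s for s in services if now - s["updated_at"] > threshold]


services = parse_rows(ROWS)
# Assumption: the newest Updated_at in the table stands in for the current time.
now = max(s["updated_at"] for s in services)
stale = stale_services(services, now)

for s in stale:
    age = int((now - s["updated_at"]).total_seconds())
    print(f"{s['host']}: last report {age}s before newest entry (state={s['state']})")
```

With the rows above this flags redhat-compute2 through redhat-compute5, which matches the State column, so the "down" state looks consistent with the compute nodes' reports simply not arriving in time rather than with the services being stopped.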
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack