Hi,

We have a 5-node OpenStack RDO Newton setup with one controller node.

The compute service status keeps fluctuating on all nodes. Each service shows
as up for a few seconds and then goes down again in nova service-list.

Here is a snapshot of it.

[root@controller-internal ~]# nova service-list
+-----+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| Id  | Binary           | Host                | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+-----+------------------+---------------------+----------+---------+-------+----------------------------+-----------------+
| 5   | nova-compute     | redhat-compute1     | compute1 | enabled | up    | 2017-11-20T10:43:35.000000 | -               |
| 12  | nova-conductor   | controller-internal | internal | enabled | up    | 2017-11-20T10:43:51.000000 | -               |
| 15  | nova-consoleauth | controller-internal | internal | enabled | up    | 2017-11-20T10:43:54.000000 | -               |
| 16  | nova-cert        | controller-internal | internal | enabled | up    | 2017-11-20T10:43:54.000000 | -               |
| 17  | nova-scheduler   | controller-internal | internal | enabled | up    | 2017-11-20T10:43:58.000000 | -               |
| 18  | nova-compute     | redhat-compute2     | compute2 | enabled | down  | 2017-11-20T10:40:47.000000 | -               |
| 19  | nova-compute     | redhat-compute3     | compute3 | enabled | down  | 2017-11-20T10:41:24.000000 | -               |
| 20  | nova-compute     | redhat-compute4     | compute4 | enabled | down  | 2017-11-19T17:11:26.000000 | -               |
| 21  | nova-compute     | redhat-compute5     | compute5 | enabled | down  | 2017-11-20T10:40:19.000000 | -               |
-----------------------------------------------------------------------------
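
To make the Updated_at column easier to compare, here is a rough Python
sketch that computes how far each compute host's last heartbeat lags behind
the newest one in the listing. It assumes nova's default service_down_time
of 60 seconds (our nova.conf may differ) and uses the newest Updated_at as a
stand-in for the time the command was run. The function name heartbeat_lag is
only illustrative.

```python
from datetime import datetime

# Assumption: nova's default service_down_time, in seconds.
SERVICE_DOWN_TIME = 60

# Updated_at values copied from the nova service-list output above.
UPDATED_AT = {
    "controller-internal": "2017-11-20T10:43:58.000000",  # nova-scheduler
    "redhat-compute1": "2017-11-20T10:43:35.000000",
    "redhat-compute2": "2017-11-20T10:40:47.000000",
    "redhat-compute3": "2017-11-20T10:41:24.000000",
    "redhat-compute4": "2017-11-19T17:11:26.000000",
    "redhat-compute5": "2017-11-20T10:40:19.000000",
}


def heartbeat_lag(updated_at):
    """Return, per host, the seconds its timestamp lags behind the newest one."""
    stamps = {
        host: datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f")
        for host, value in updated_at.items()
    }
    newest = max(stamps.values())
    return {host: (newest - ts).total_seconds() for host, ts in stamps.items()}


if __name__ == "__main__":
    for host, lag in sorted(heartbeat_lag(UPDATED_AT).items()):
        state = "down" if lag > SERVICE_DOWN_TIME else "up"
        print(f"{host}: {lag:.0f}s behind -> {state}")
```

With these numbers the hosts reported as down are the ones whose last update
is more than a minute old, so the state column appears to follow the age of
the last heartbeat written to the database.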

From the RabbitMQ logs I can see that a few reply exchanges are missing, and
there is also a heartbeat_timeout error.
----------------------------------------------------------------------------------------------------------

=ERROR REPORT==== 20-Nov-2017::05:26:24 ===
connection <0.4177.0>, channel 1 - soft error:
{amqp_error,not_found,
            "no exchange 'reply_6cd8b245e6b84bc3a80cfc07da243b79' in vhost '/'",
            'exchange.declare'}

=ERROR REPORT==== 20-Nov-2017::05:26:25 ===
closing AMQP connection <0.2725.0> (10.0.0.2:59658 -> 10.0.0.2:5672):
{heartbeat_timeout,running}
----------------------------------------------------------------------------------------------------------

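I can see that reply_6cd8b245e6b84bc3a80cfc07da243b79 is present in RabbitMQ
(checked with list_queues, output below), so I don't understand why the
service could not connect to it. Telnet to the MQ port works fine from the
compute nodes, and no firewall is enabled.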

[root@controller-internal ~]# rabbitmqctl list_queues | grep reply_6cd8b245e6b84bc3a80cfc07da243b79
reply_6cd8b245e6b84bc3a80cfc07da243b79  0
[root@controller-internal ~]#
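
Note that list_queues reports queues, while the error in the log is about an
exchange, so the check above may not be looking at the same object. A minimal
Python sketch for checking the exchange itself, assuming it runs on the
broker host with permission to call rabbitmqctl and Python 3.7 or later. The
helper names exchange_names and has_exchange are hypothetical.

```python
import subprocess


def exchange_names(listing):
    """Extract exchange names from `rabbitmqctl list_exchanges name` output."""
    names = set()
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and status lines such as "Listing exchanges ..."
        # or "...done."; the default exchange has an empty name and is skipped too.
        if not line or line.startswith("Listing") or line.startswith("..."):
            continue
        names.add(line.split("\t")[0])
    return names


def has_exchange(name, vhost="/"):
    """Ask the local broker whether an exchange with this name exists."""
    result = subprocess.run(
        ["rabbitmqctl", "list_exchanges", "-p", vhost, "name"],
        check=True,
        capture_output=True,
        text=True,
    )
    return name in exchange_names(result.stdout)


if __name__ == "__main__":
    print(has_exchange("reply_6cd8b245e6b84bc3a80cfc07da243b79"))
```

If this prints False while the queue of the same name is still listed, the
queue and the exchange have gone out of sync on the broker.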

Any suggestions or troubleshooting steps are welcome.



Thanks & Regards,
Shyam Biradar,
Email: shyambiradarsgg...@gmail.com,
Contact: +91 8600266938.
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
