During the design of HA deployments for Neutron, I have found that agents can run into problems and still keep running, but they have no way to expose their status to a parent process, or in a form that an init.d script could query.
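To illustrate the kind of status an init.d script or monitor could query, here is a rough Python sketch. It assumes a small JSON status file written next to the agent's pid file. The `.status` suffix, the field names, and the functions `write_status` and `read_status` are all hypothetical, made up for illustration and not taken from existing Neutron code.

```python
import json
import os
import tempfile
import time


def write_status(pid_file, state, detail=""):
    """Agent side: atomically write a status file next to pid_file.

    The "<pid_file>.status" name and the JSON fields are assumptions
    made for this sketch only.
    """
    status_file = pid_file + ".status"
    payload = {
        "pid": os.getpid(),
        "state": state,          # e.g. "ok" or "error"
        "detail": detail,        # free-form description of the condition
        "timestamp": time.time(),
    }
    directory = os.path.dirname(os.path.abspath(status_file))
    # Write to a temporary file in the same directory, then rename it,
    # so a reader never sees a half-written status file.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        json.dump(payload, tmp)
    os.replace(tmp_path, status_file)
    return status_file


def read_status(pid_file, max_age=60.0):
    """Monitor side: return the reported state, or "unknown".

    "unknown" covers a missing, unreadable, malformed, or stale file.
    """
    try:
        with open(pid_file + ".status") as f:
            payload = json.load(f)
    except (OSError, ValueError):
        return "unknown"
    if not isinstance(payload, dict):
        return "unknown"
    if time.time() - payload.get("timestamp", 0) > max_age:
        return "unknown"
    return payload.get("state", "unknown")
```

An agent could call something like `write_status(pid_file, "error", "dnsmasq died")` when it detects a failed sub-process, and a monitoring or HA tool could poll `read_status(pid_file)` and react to anything other than "ok". Treating a stale file as "unknown" is one way to also catch an agent that is hung and no longer updating its status.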
So I'm proposing this blueprint,

https://blueprints.launchpad.net/neutron/+spec/agent-service-status

to make agents expose internal status conditions via the filesystem, as an extension of the current pid file. This way, permanent or transient error conditions could be handled by standard monitoring (or HA) solutions, which could notify or take action as appropriate.

It's a simple change that can make HA deployments more robust and capable of handling situations like this one (if a dnsmasq process spawned by neutron dies, neutron-dhcp-agent will be totally unaware):

https://bugs.launchpad.net/neutron/+bug/1257524

We have the exact same problem with the other agents and sub-processes.

So I'm interested in getting this done for icehouse-3.

Any feedback?

Best regards,
Miguel Ángel Ajo.

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev