Hey All,

So I was working on Bug 1493520 [0], which is about what happens when a controller runs out of space. For this I came up with a solution [1] that leverages pacemaker to migrate services away from the controller when it runs out of space. This works great for rabbitmq/mysql, where if they run out of space "Bad Things Happen"™.

But there is a problem that I didn't realize until QA got hold of it. We run services (neutron-server, heat, nova-api, etc.) on our controllers that are not managed by pacemaker/corosync, so they keep running and will still attempt to serve requests sent to them by the haproxy that was conveniently moved off of the bad controller node.

So my question is, what should we do in this case?

Should we be managing the start/stop of these services via Pacemaker? We would get the additional benefit of having these services restarted if they crash, and they would get stopped when the node health goes red, like the rest of the services. But this will add complexity if anyone wants to extract these services off of the controller to run on their own.

Another solution would be to try and figure out some haproxy solution for the node. I'm not sure if haproxy has the concept of a node health check like some other load-balancing solutions do. If it did, we could create a single node health check that queries the cluster health status and downs all of the services on that node at once.

Thoughts?

Thanks,
-Alex

[0] https://bugs.launchpad.net/fuel/+bug/1493520
[1] https://review.openstack.org/#/c/226062/
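
For the haproxy option, here is a rough sketch of what a node-level check could look like, assuming HAProxy's agent-check feature is available in the deployed version. With agent-check, HAProxy connects to a small TCP agent on the backend node and reads back a one-line status such as "up" or "down". Everything below is illustrative and not taken from the proposed patch: the names agent_reply, make_handler and serve, the port 9999, and the choice to treat "yellow" as still up are all assumptions, and how the node health colour is actually read from the cluster is left to a caller-supplied function.

```python
import socketserver
from typing import Callable

# Map a Pacemaker-style node health colour to an HAProxy agent-check reply.
# Assumption for illustration: only "red" takes the node out of rotation.
REPLIES = {"green": "up\n", "yellow": "up\n", "red": "down\n"}


def agent_reply(health: str) -> str:
    """Return the agent-check line for a health colour; unknown values fail closed."""
    return REPLIES.get(health.strip().lower(), "down\n")


def make_handler(read_health: Callable[[], str]):
    """Build a handler that answers each agent connection with one status line.

    read_health is a hypothetical callable that returns the current node
    health colour; how it queries the cluster is deliberately not shown.
    """

    class AgentHandler(socketserver.BaseRequestHandler):
        def handle(self) -> None:
            self.request.sendall(agent_reply(read_health()).encode("ascii"))

    return AgentHandler


def serve(read_health: Callable[[], str], host: str = "0.0.0.0", port: int = 9999) -> None:
    """Serve agent-check replies until interrupted (port 9999 is an arbitrary choice)."""
    with socketserver.TCPServer((host, port), make_handler(read_health)) as server:
        server.serve_forever()
```

Each backend entry for that controller would then point at the same agent, for example "server node-1 192.0.2.10:9696 check agent-check agent-port 9999 agent-inter 5s" (the server name and address are placeholders), so one agent per node could down every service on it at once.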