Take this with a grain of salt: we're using the original version from before the project moved under the Big Tent, and I'm not sure how much it has evolved since then. I assume the basic functions are the same, though.
You're correct; Corosync and Pacemaker are used to determine if a compute node goes down. The masakari-host-monitor process runs on each compute node, checks the cluster status, and sends a notification to masakari-controller when a node goes down. The controller process keeps a list of reserved hosts in its database and calls nova host-evacuate to move the instances to one of the reserved hosts.

In our environment I also configured STONITH, and I'd highly recommend it. With STONITH, Pacemaker sends a shutdown command to the out-of-band management card of the unreachable node to make sure that it can't come back and cause a conflict.

There are two other components, masakari-process-monitor and masakari-instance-monitor. These also run on your compute nodes. The former watches the nova-compute service, and the latter monitors running instances and restarts them if necessary.

Looking here, it seems they've split Masakari into three different repos:

https://github.com/openstack?utf8=%E2%9C%93&q=masakari&type=&language=

- masakari - The controller service and API
- masakari-monitors - Compute node monitoring services
- python-masakari-client - The CLI tools
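
In case it helps to picture the host-failure flow I described, here is a rough toy model in Python. It is not Masakari's code: the Controller class, the handle_host_down method, and the host and instance names are all made up for illustration. I'm also assuming that a reserved host is taken off the list once it has been used, which may not match what the real controller does.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Toy stand-in for masakari-controller (hypothetical, not the real code)."""
    reserved_hosts: list   # spare compute nodes kept free for failover
    instances: dict        # instance name -> compute host it runs on

    def handle_host_down(self, failed_host):
        """React to a 'host down' notification sent by a host monitor.

        By this point STONITH should already have powered off the failed
        node, so it cannot come back while instances are being moved.
        """
        if not self.reserved_hosts:
            raise RuntimeError("no reserved host left to evacuate to")
        # Assumption: a reserved host is consumed once it is used.
        target = self.reserved_hosts.pop(0)
        moved = [name for name, host in self.instances.items()
                 if host == failed_host]
        for name in moved:
            # Roughly the effect of `nova host-evacuate` on each instance.
            self.instances[name] = target
        return target, moved


controller = Controller(
    reserved_hosts=["spare-01"],
    instances={"vm-a": "compute-01", "vm-b": "compute-01", "vm-c": "compute-02"},
)
target, moved = controller.handle_host_down("compute-01")
print(target, sorted(moved))  # spare-01 ['vm-a', 'vm-b']
```

The point of the sketch is just the ordering: the node is fenced first, then the controller picks a reserved host and moves everything that was running on the failed node. Instances on other compute nodes are left alone.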