On 07/03/24 02:28 +0000, Ali Shahbazifakhr via Users wrote:
Hello,

I am reaching out to inquire about the usage of Pacemaker on Google Compute 
Engine (GCE), specifically in conjunction with Managed Instance Groups (MIG). 
Our team is currently exploring options for implementing high availability and 
failover solutions within our infrastructure on GCE, and we believe that 
Pacemaker may be a viable option for achieving this.

Could you kindly provide some insights into how Pacemaker is utilized within 
the GCE environment, particularly in scenarios involving Managed Instance 
Groups? We are interested in understanding the design considerations and best 
practices for implementing Pacemaker with MIG instances.

Additionally, if there are any documentation resources available that explain 
the design and implementation of Pacemaker with MIG instances on GCE, we would 
greatly appreciate it if you could point us in the right direction.

I don't have any experience with MIG, but from a quick look it seems
it can be used to replace and/or autoscale instances. I would suggest
not letting it replace nodes (Pacemaker takes care of badly behaving
nodes). If you use the autoscale functionality, you will have to run
"pcs host auth <hostname>" and "pcs cluster node add <hostname>" /
"pcs cluster node remove <hostname>" to add/remove nodes.
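
As a rough sketch of those steps, something like the following could be
called from whatever hook reacts to a scale-out or scale-in event. The
hostname "node3", the function names and the DRY_RUN switch are only
illustrative assumptions; by default the script just prints the pcs
commands instead of running them.

```shell
#!/bin/sh
# Sketch only: wraps the pcs commands mentioned above.
# DRY_RUN=1 (the default here) prints the commands; set DRY_RUN=0 on an
# existing cluster node to actually run them.
DRY_RUN="${DRY_RUN:-1}"

run_pcs() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "pcs $*"
    else
        pcs "$@"
    fi
}

add_node() {
    # $1 = hostname of the new instance (placeholder, e.g. node3).
    # "pcs host auth" normally asks for the hacluster credentials.
    run_pcs host auth "$1" &&
    run_pcs cluster node add "$1"
}

remove_node() {
    # $1 = hostname of the instance that is being scaled in.
    run_pcs cluster node remove "$1"
}
```

For example, "add_node node3" when an instance appears and
"remove_node node3" before it is deleted.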

You can use fence_gce to fence (reboot) badly behaving nodes:
https://github.com/ClusterLabs/fence-agents/blob/main/agents/gce/fence_gce.py

and the gcp-* agents handle IPs, routes, disks, or load balancer(s):
https://github.com/ClusterLabs/resource-agents/tree/main/heartbeat
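
For the fencing part, a minimal sketch of setting up fence_gce could
look like this. The resource name "gce-fence", the project ID and the
zone are placeholders, and I'm assuming the agent's "project" and "zone"
parameters are enough for your setup; check the agent's metadata for
the full list. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints (DRY_RUN=1, the default) or runs (DRY_RUN=0) the
# pcs commands for inspecting and creating a fence_gce stonith device.
DRY_RUN="${DRY_RUN:-1}"

run_pcs() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "pcs $*"
    else
        pcs "$@"
    fi
}

describe_gce_fence() {
    # Shows the agent's parameters once the fence-agents package is installed.
    run_pcs stonith describe fence_gce
}

create_gce_fence() {
    # $1 = GCP project ID, $2 = zone (both placeholders).
    # "gce-fence" is an arbitrary name for the stonith resource.
    run_pcs stonith create gce-fence fence_gce project="$1" zone="$2"
}
```

For example: "create_gce_fence my-project us-central1-a".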

There are metadata/description sections in the agents' code, so you can
find all the info without having to install the packages.

If you're new to Pacemaker this is a good introduction:
https://www.clusterlabs.org/pacemaker/doc/2.1/Clusters_from_Scratch/singlehtml/

For software that doesn't have a resource agent, you can let Pacemaker
manage it via its systemd or init services/scripts, or write your own
agent if you need e.g. additional monitoring to check that the service
is still alive:
https://github.com/ClusterLabs/resource-agents/blob/main/doc/dev-guides/ra-dev-guide.asc
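
For the systemd route, a minimal sketch could look like this. The unit
name "my-app" and the 30s monitor interval are just placeholders, and
the script only prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: prints (DRY_RUN=1, the default) or runs (DRY_RUN=0) the
# pcs command that puts an existing systemd unit under Pacemaker control.
DRY_RUN="${DRY_RUN:-1}"

run_pcs() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "pcs $*"
    else
        pcs "$@"
    fi
}

create_systemd_resource() {
    # $1 = systemd unit name without ".service" (placeholder, e.g. my-app).
    # The resource gets the same name as the unit; the monitor interval
    # is an arbitrary example value.
    run_pcs resource create "$1" "systemd:$1" op monitor interval=30s
}
```

For example: "create_systemd_resource my-app". The unit should not also
be enabled in systemd itself, since Pacemaker starts and stops it.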


Oyvind

Looking forward to your response.


Ali Shahbazi
Specialist Enterprise Architecture | IoT Industrial, Solutions System 
Engineering |
T:  | C: 403-702-3093
What's New at CN<https://www.cn.ca/whatsnew> | Quoi de neuf au 
CN<https://www.cn.ca/quoi-de-neuf>



_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/

