On 02/14/2017 02:51 PM, Nils Carlson wrote:
> Hi,
>
> I'm working on implementing a MariaDB resource-agent based on the mysql
> one.
> The idea is to take advantage of new features in MariaDB, especially
> semi-synchronous replication and GTID.
>
> GTID (Global Transaction ID) means that there is a counter that applies
> to the replicated databases, which is unique within the cluster (there
> can be multiple replication clusters with overlapping IDs).
>
> Semi-synchronous replication means that the master will replicate
> synchronously to AT LEAST ONE slave, before actually performing the
> transaction. In theory there can be no data-loss due to a single node
> failure, a big improvement compared to the normal async replication in
> MariaDB.
>
> These two sets of technologies should allow for quite a straightforward
> set of semantics in the resource-agent.
> On master failure, the node with the highest GTID must be the one that
> was replicating synchronously, and should be promoted to be the new
> master. The question is how to relay the information to crmd.
>
> My current working hypothesis is that I can place the GTID as a
> crm-attribute both when starting the resource-agent and in a post-demote
> notify. During the subsequent monitor operation the resource-agents can
> then scan the crm-attributes from other nodes and simply prioritise
> themselves in relation to others (some relative scoring?).
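To make the relative-scoring idea concrete, here is a minimal sketch, not taken from any existing agent. It assumes each node has published a single-domain MariaDB GTID of the form domain-server-sequence, and that the agent has already collected those values into a dictionary. The function names, the default top score of 100, and the step of 10 are all hypothetical choices for illustration.

```python
from typing import Dict, Tuple


def parse_gtid(gtid: str) -> Tuple[int, int, int]:
    """Split a 'domain-server-sequence' GTID string into three integers."""
    domain, server, sequence = (int(part) for part in gtid.strip().split("-"))
    return domain, server, sequence


def promotion_scores(gtids: Dict[str, str], domain: int = 0,
                     top: int = 100, step: int = 10) -> Dict[str, int]:
    """Map each node to a promotion score; the highest sequence gets `top`.

    Nodes reporting the same sequence number share the same score.
    Nodes whose GTID belongs to a different domain are left out.
    """
    sequences = {}
    for node, gtid in gtids.items():
        gtid_domain, _server, sequence = parse_gtid(gtid)
        if gtid_domain == domain:
            sequences[node] = sequence

    # Rank the distinct sequence numbers, newest first.
    distinct = sorted(set(sequences.values()), reverse=True)
    rank = {sequence: i for i, sequence in enumerate(distinct)}

    # Never drop below 1 so every ranked node stays promotable.
    return {node: max(top - step * rank[sequence], 1)
            for node, sequence in sequences.items()}
```

With three nodes reporting 0-1-100, 0-1-105 and 0-1-100, the node at sequence 105 gets the top score and the other two share the next one. A real agent would still have to decide what to do when a node has not published a value yet.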
A bit of a tangent: you can set attributes from a resource agent using
either crm_attribute or attrd_updater. Each has advantages and
disadvantages.

crm_attribute can set a permanent or transient attribute, while
attrd_updater only sets transient attributes. (A node's transient
attributes go away when the node reboots or otherwise stops cluster
services.)

crm_attribute can only set public attributes, while attrd_updater can
set public or private attributes. Public attributes are recorded in the
CIB, and changing one triggers a new transition (i.e. the cluster checks
whether any resources need to be started/stopped/moved). Private
attributes are not saved to the CIB, and do not cause a new transition.
Public attributes can be referenced in constraint rules, while private
attributes cannot. Private attributes have been supported since
Pacemaker 1.1.13.

attrd_updater works with Pacemaker Remote nodes only when the cluster
nodes use the corosync 2 stack. It is silently ignored for Pacemaker
Remote nodes when the cluster nodes use a legacy stack
(heartbeat/cman/corosync-plugin). crm_attribute works with remote nodes
on legacy stacks since Pacemaker 1.1.15.

I'd prefer attrd_updater with private transient attributes if that works
for your purposes, because it saves unnecessary recalculation of the
cluster state plus disk I/O.

> This requires a few things though:
>
> - If there is no master when the resource agent starts we need to wait
> for all nodes to come online (i.e. the cluster is just starting) before
> promoting any to master, so they can read GTID from the attributes.
> - There must be a monitor step after start and demote and before the
> promotion of any resource to master, and this must execute on all nodes
> so they can set their priority for promotion.
> - The post-demote notifier must complete execution before a node can
> start the monitor operation.
> I THINK that it is ok for not all nodes to
> have completed the post-demote notifier before the monitor operation
> starts, probably this can work by creating a sparse priority
> distribution, i.e. First node to execute monitor sets a priority of 100
> - the next one down 90 - the next one in the middle at 95, based on the
> number of nodes etc.
>
> I hope this doesn't sound too tangled, I will try this out, but I can't
> find any clear documentation on the ordering and completion of start,
> notifiers, monitor and promote operations as well as master selection,
> so all pointers are very much welcome.
>
> And completely alternative suggestions also very much welcome.
>
> Thanks for any and all assistance,
> Nils

You may want to look at the ocf:heartbeat:galera agent -- I believe it
has some similar concerns.
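As a rough illustration of the crm_attribute versus attrd_updater choice discussed earlier in this reply, the sketch below only builds the command lines rather than running them, so it can be read without a cluster at hand. The helper names and the attribute name "mariadb-gtid" are hypothetical; the options used (--name, --update, --private, --node, --lifetime) are the long forms documented in the tools' man pages, but check them against your installed Pacemaker version.

```python
from typing import List, Optional


def attrd_updater_cmd(name: str, value: str, private: bool = True,
                      node: Optional[str] = None) -> List[str]:
    """Build an attrd_updater call that sets a transient node attribute.

    With private=True the attribute is not written to the CIB and does
    not trigger a new transition (needs Pacemaker 1.1.13 or later).
    """
    cmd = ["attrd_updater", "--name", name, "--update", value]
    if private:
        cmd.append("--private")
    if node is not None:
        cmd += ["--node", node]
    return cmd


def crm_attribute_cmd(name: str, value: str, node: str,
                      permanent: bool = False) -> List[str]:
    """Build a crm_attribute call that sets a public node attribute.

    Lifetime 'reboot' is transient; 'forever' is permanent.
    """
    lifetime = "forever" if permanent else "reboot"
    return ["crm_attribute", "--node", node, "--name", name,
            "--update", value, "--lifetime", lifetime]
```

On a cluster node, either list could be handed to subprocess.run(cmd, check=True). An actual resource agent would more likely call the tools directly from shell.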
