Hi,

I'm working on implementing a MariaDB resource-agent based on the mysql one.
The idea is to take advantage of new features in MariaDB, especially semi-synchronous replication and GTID.

GTID (Global Transaction ID) means that every replicated transaction is tagged with a counter-based ID that is unique within the replication cluster (separate replication clusters can have overlapping IDs).

Semi-synchronous replication means that the master waits for AT LEAST ONE slave to acknowledge that it has received the transaction before the commit is confirmed to the client. In theory, no data can then be lost due to a single node failure, a big improvement over the normal async replication in MariaDB.

These two sets of technologies should allow for quite a straightforward set of semantics in the resource-agent. On master failure, the node with the highest GTID must be the one that was replicating synchronously, and should be promoted to be the new master. The question is how to relay the information to crmd.
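As a rough sketch of the selection rule above (Python purely for illustration; a real resource-agent would more likely be a shell script): MariaDB GTIDs have the form domain-server-sequence, so "highest GTID" is taken here to mean the highest sequence number. The function names and the single-replication-domain restriction are assumptions of this sketch, not something MariaDB or Pacemaker prescribes.

```python
from typing import Dict, Tuple


def parse_gtid(gtid: str) -> Tuple[int, int, int]:
    """Split a MariaDB GTID 'domain-server-sequence' into three integers."""
    domain, server, seq = (int(part) for part in gtid.strip().split("-"))
    return domain, server, seq


def pick_promotion_candidate(positions: Dict[str, str]) -> str:
    """Return the node whose GTID carries the highest sequence number.

    Assumption: all nodes report a GTID from the same replication domain.
    Ties are broken by node name so that every node reaches the same answer.
    """
    if not positions:
        raise ValueError("no GTID positions reported")
    parsed = {node: parse_gtid(gtid) for node, gtid in positions.items()}
    if len({p[0] for p in parsed.values()}) != 1:
        raise ValueError("GTIDs from different domains are not comparable here")
    # max() keeps the first maximum it sees, so iterating in sorted order
    # makes the alphabetically first node win a tie.
    return max(sorted(parsed), key=lambda node: parsed[node][2])


if __name__ == "__main__":
    reported = {"node1": "0-1-100", "node2": "0-1-104", "node3": "0-1-104"}
    print(pick_promotion_candidate(reported))
```

The tie-break matters because, with semi-synchronous replication, more than one slave may have received the last transaction.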

My current working hypothesis is that I can store the GTID as a crm-attribute both when starting the resource-agent and in a post-demote notify. During the subsequent monitor operation, each resource-agent can then scan the crm-attributes of the other nodes and simply prioritise itself relative to them (some relative scoring?).
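One possible form of the relative scoring, sketched in Python for illustration only: each node looks at the GTIDs that all nodes have published and lowers its own score for every peer that is ahead of it. How the attributes are read and how the score is handed to the cluster (for example with Pacemaker's crm_attribute and crm_master tools) is deliberately left out; the function names and the base and step values are hypothetical.

```python
from typing import Dict


def gtid_sequence(gtid: str) -> int:
    """Return the sequence part of a 'domain-server-sequence' GTID."""
    return int(gtid.strip().split("-")[2])


def promotion_score(my_node: str, published: Dict[str, str],
                    base: int = 100, step: int = 10) -> int:
    """Score this node relative to the GTIDs published by all nodes.

    Assumption: every node has already published its GTID and all GTIDs
    belong to the same replication domain. The score drops by `step` for
    each peer that is strictly ahead, and never goes below zero.
    """
    mine = gtid_sequence(published[my_node])
    ahead = sum(
        1
        for node, gtid in published.items()
        if node != my_node and gtid_sequence(gtid) > mine
    )
    return max(base - step * ahead, 0)


if __name__ == "__main__":
    published = {"node1": "0-1-50", "node2": "0-2-48", "node3": "0-1-50"}
    for node in sorted(published):
        print(node, promotion_score(node, published))
```

Nodes with the same GTID end up with the same score, so the cluster is free to pick either of them.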

This requires a few things though:

- If there is no master when the resource agent starts (i.e. the cluster is just starting), we need to wait for all nodes to come online before promoting any of them to master, so that they can read each other's GTID from the attributes.
- There must be a monitor step after start and demote and before the promotion of any resource to master, and it must execute on all nodes so that they can set their priority for promotion.
- The post-demote notifier must complete execution before a node can start the monitor operation. I THINK it is OK if not all nodes have completed the post-demote notifier before the monitor operation starts; this can probably work by creating a sparse priority distribution, i.e. the first node to execute monitor sets a priority of 100, the next one down 90, the next one in the middle 95, based on the number of nodes etc.
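The sparse priority distribution from the last point could look roughly like this (a Python sketch for illustration; the function name, the mapping it takes, and the top and gap values are all hypothetical). A node that runs monitor late picks a score between those of its nearest neighbours by GTID, so nodes that already set their priority do not need to change it.

```python
from typing import Dict


def sparse_score(new_seq: int, assigned: Dict[int, int],
                 top: int = 100, gap: int = 10) -> int:
    """Pick a score for a node with GTID sequence `new_seq`.

    `assigned` maps GTID sequence numbers to scores that other nodes have
    already published. Higher sequence numbers must get higher scores.
    """
    if new_seq in assigned:
        return assigned[new_seq]           # same position, same score
    if not assigned:
        return top                         # first node to run monitor
    higher = [seq for seq in assigned if seq > new_seq]
    lower = [seq for seq in assigned if seq < new_seq]
    if not higher:
        return assigned[max(lower)] + gap  # ahead of everyone so far
    if not lower:
        return assigned[min(higher)] - gap  # behind everyone so far
    # Land halfway between the two nearest neighbours.
    return (assigned[min(higher)] + assigned[max(lower)]) // 2


if __name__ == "__main__":
    scores: Dict[int, int] = {}
    for seq in (50, 40, 45):               # order in which nodes run monitor
        scores[seq] = sparse_score(seq, scores)
        print(seq, scores[seq])
```

With integer scores the midpoint collides with a neighbour once two adjacent scores differ by less than 2, so the gap has to be chosen with the number of nodes in mind.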

I hope this doesn't sound too tangled. I will try this out, but I can't find any clear documentation on the ordering and completion of the start, notify, monitor and promote operations, or on master selection, so all pointers are very much welcome.

Completely alternative suggestions are also very much welcome.

Thanks for any and all assistance,
Nils


_______________________________________________
Developers mailing list
[email protected]
http://lists.clusterlabs.org/mailman/listinfo/developers
