On Fri, Dec 16, 2016 at 7:26 AM, Ken Gaillot <kgail...@redhat.com> wrote:
> On 12/15/2016 02:00 PM, Chris Walker wrote:
>> Hello,
>>
>> I have a quick question about dc-deadtime. I believe that Digimer and
>> others on this list might have already addressed this, but I want to
>> make sure I'm not missing something.
>>
>> If my understanding is correct, dc-deadtime sets the amount of time that
>> must elapse before a cluster is formed (DC is elected, etc), regardless
>> of which nodes have joined the cluster. In other words, even if all
>> nodes that are explicitly enumerated in the nodelist section have
>> started Pacemaker, they will still wait dc-deadtime before forming a
>> cluster.
>>
>> In my case, I have a two-node cluster on which I'd like to allow a
>> pretty long time (~5 minutes) for both nodes to join before giving up on
>> them. However, if they both join quickly, I'd like to proceed to form a
>> cluster immediately; I don't want to wait for the full five minutes to
>> elapse before forming a cluster. Further, if a node doesn't respond
>> within five minutes, I want to fence it and start resources on the node
>> that is up.
>
> Pacemaker+corosync behaves as you describe by default.
>
> dc-deadtime is how long to wait for an election to finish, but if the
> election finishes sooner than that (i.e. a DC is elected), it stops
> waiting. It doesn't even wait for all nodes, just a quorum.
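(For anyone who wants to try the timers discussed here, a minimal sketch of
adjusting them, assuming the pcs shell is in use -- crm_attribute or the crm
shell work equally well, and the property names are as in recent Pacemaker
releases:

    # Allow up to 5 minutes for peers to respond during startup
    # (the default is 20s):
    pcs property set dc-deadtime=5min

    # Fence any node that has not been seen once startup completes;
    # this is already the default:
    pcs property set startup-fencing=true

Both properties live in the cluster-wide CIB options, so setting them on one
node is enough.)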
You're confusing dc_deadtime with election_timeout:

./crmd/control.c:899:    { XML_CONFIG_ATTR_DC_DEADTIME, "dc_deadtime", "time", NULL, "20s", &check_time,
./crmd/control.c-900-      "How long to wait for a response from other nodes during startup.",
./crmd/control.c-901-      "The \"correct\" value will depend on the speed/load of your network and the type of switches used."
./crmd/control.c-902-    },

./crmd/control.c:934:    { XML_CONFIG_ATTR_ELECTION_FAIL, "election_timeout", "time", NULL, "2min", &check_timer,
./crmd/control.c-935-      "*** Advanced Use Only ***.", "If need to adjust this value, it probably indicates the presence of a bug."
./crmd/control.c-936-    },

"during startup" is incomplete, though... we also start that timer after
partition changes, in case the DC was one of the nodes lost.

>
> Also, with startup-fencing=true (the default), any unseen nodes will be
> fenced, and the remaining nodes will proceed to host resources. Of
> course, it needs quorum for this, too.
>
> With two nodes, quorum is handled specially, but that's a different topic.
>
>> With Pacemaker/Heartbeat, the initdead parameter did exactly what I
>> want, but I don't see any way to do this with Pacemaker/Corosync. From
>> reading other posts, it looks like people use an external agent to start
>> HA daemons once nodes are up ... is this a correct understanding?
>>
>> Thanks very much,
>> Chris
>
> _______________________________________________
> Users mailing list: Users@clusterlabs.org
> http://lists.clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
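(For completeness on the special two-node quorum handling mentioned above, a
minimal sketch of the relevant corosync.conf section, assuming corosync 2.x
with votequorum:

    quorum {
        provider: corosync_votequorum
        two_node: 1
        # two_node implicitly enables wait_for_all, so on a fresh start
        # quorum is only granted once both nodes have been seen at least
        # once; after that, a single surviving node retains quorum.
    }

See votequorum(5) for the full semantics of two_node and wait_for_all.)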