On Tue, Jan 30, 2024 at 2:21 PM Walker, Chris <christopher.wal...@hpe.com> wrote:
> >>> However, now it seems to wait that amount of time before it elects a DC, even when quorum is acquired earlier. In my log snippet below, with dc-deadtime 300s,
> >>
> >> The dc-deadtime is not waiting for quorum, but for another DC to show up. If all nodes show up, it can proceed, but otherwise it has to wait.
> >
> > I believe all the nodes showed up by 14:17:04, but it still waited until 14:19:26 to elect a DC:
> >
> > Jan 29 14:14:25 gopher12 pacemaker-controld [123697] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (peer_update_callback) info: Cluster node gopher11 is now member (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (quorum_notification_cb) notice: Quorum acquired | membership=54 members=2
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info: Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
> >
> > This is a cluster with 2 nodes, gopher11 and gopher12.
>
> This is our experience with dc-deadtime too: even if both nodes in the cluster show up, dc-deadtime must elapse before the cluster starts. This was discussed on this list a while back (https://www.mail-archive.com/users@clusterlabs.org/msg03897.html) and an RFE came out of it (https://bugs.clusterlabs.org/show_bug.cgi?id=5310).
>
> I've worked around this by having an ExecStartPre directive for Corosync that does essentially:
>
> while ! systemctl -H ${peer} is-active corosync; do sleep 5; done
>
> With this in place, the nodes wait for each other before starting Corosync and Pacemaker. We can then use the default 20s dc-deadtime so that the DC election happens quickly once both nodes are up.

Actually, wait_for_all, which is enabled by default with two_node, should delay quorum until both nodes have shown up. And if we make the cluster not ignore quorum, it shouldn't start fencing before it sees the peer - right?

Running a 2-node cluster that ignores quorum, or runs without wait_for_all, is a delicate thing anyway, I would say, and shouldn't be expected to work in the generic case. Not saying that is the issue here - I guess there just isn't enough info about the cluster to say.

So you shouldn't need the raised dc-deadtime, and thus wouldn't experience the large startup delays.

Regards,
Klaus

> Thanks,
>
> Chris
>
> From: Users <users-boun...@clusterlabs.org> on behalf of Faaland, Olaf P. via Users <users@clusterlabs.org>
> Date: Monday, January 29, 2024 at 7:46 PM
> To: Ken Gaillot <kgail...@redhat.com>, Cluster Labs - All topics related to open-source clustering welcomed <users@clusterlabs.org>
> Cc: Faaland, Olaf P. <faala...@llnl.gov>
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
>
> >> However, now it seems to wait that amount of time before it elects a DC, even when quorum is acquired earlier. In my log snippet below, with dc-deadtime 300s,
> >
> > The dc-deadtime is not waiting for quorum, but for another DC to show up. If all nodes show up, it can proceed, but otherwise it has to wait.
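(For anyone wanting to reproduce Chris's ExecStartPre workaround quoted above: one way to wire it in is a systemd drop-in for corosync. This is only a sketch - the drop-in path, the hard-coded peer name, and the TimeoutStartSec value are illustrative, not from the original post - and "systemctl -H" needs working SSH/root access to the peer.)

# /etc/systemd/system/corosync.service.d/wait-for-peer.conf (hypothetical drop-in)
[Service]
# Poll every 5s until the peer's corosync unit is active before starting our own.
ExecStartPre=/bin/sh -c 'while ! systemctl -H gopher11 is-active corosync; do sleep 5; done'
# ExecStartPre time counts against the unit's start timeout, so give it room.
TimeoutStartSec=15min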
> >
> I believe all the nodes showed up by 14:17:04, but it still waited until 14:19:26 to elect a DC:
>
> Jan 29 14:14:25 gopher12 pacemaker-controld [123697] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (peer_update_callback) info: Cluster node gopher11 is now member (was in unknown state)
> Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (quorum_notification_cb) notice: Quorum acquired | membership=54 members=2
> Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info: Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
>
> This is a cluster with 2 nodes, gopher11 and gopher12.
>
> Am I misreading that?
>
> thanks,
> Olaf
>
> ________________________________________
> From: Ken Gaillot <kgail...@redhat.com>
> Sent: Monday, January 29, 2024 3:49 PM
> To: Faaland, Olaf P.; Cluster Labs - All topics related to open-source clustering welcomed
> Subject: Re: [ClusterLabs] controlling cluster behavior on startup
>
> On Mon, 2024-01-29 at 22:48 +0000, Faaland, Olaf P. wrote:
> > Thank you, Ken.
> >
> > I changed my configuration management system to put an initial cib.xml into /var/lib/pacemaker/cib/, which sets all the property values I was setting via pcs commands, including dc-deadtime. I removed those "pcs property set" commands from the ones that are run at startup time.
> >
> > That worked in the sense that after Pacemaker starts, the node waits for my newly specified dc-deadtime of 300s before giving up on the partner node and fencing it, if the partner never appears as a member.
> >
> > However, now it seems to wait that amount of time before it elects a DC, even when quorum is acquired earlier. In my log snippet below, with dc-deadtime 300s,
>
> The dc-deadtime is not waiting for quorum, but for another DC to show up. If all nodes show up, it can proceed, but otherwise it has to wait.
>
> > 14:14:24 Pacemaker starts on gopher12
> > 14:17:04 quorum is acquired
> > 14:19:26 Election Trigger just popped (start time + dc-deadtime seconds)
> > 14:19:26 gopher12 wins the election
> >
> > Is there other configuration that needs to be present in the cib at startup time?
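(A note on the pre-seeding approach Olaf describes above: the prepared cib.xml has to be in place before Pacemaker's first start on the freshly booted node, and with the ownership pacemaker-based expects. A minimal sketch - the staging path is hypothetical, and hacluster:haclient with mode 0600 is the usual ownership assumption:)

# Install the prepared CIB before Pacemaker has started on this boot,
# then start the cluster stack.
install -D -o hacluster -g haclient -m 0600 /tmp/initial-cib.xml \
    /var/lib/pacemaker/cib/cib.xml
systemctl start pacemaker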
> >
> > thanks,
> > Olaf
> >
> > === log extract using new system of installing partial cib.xml before startup
> > Jan 29 14:14:24 gopher12 pacemakerd [123690] (main) notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features:agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages monotonic nagios ncurses remote systemd
> > Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] (attrd_start_election_if_needed) info: Starting an election to determine the writer
> > Jan 29 14:14:25 gopher12 pacemaker-attrd [123695] (election_check) info: election-attrd won by local node
> > Jan 29 14:14:25 gopher12 pacemaker-controld [123697] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state)
> > Jan 29 14:17:04 gopher12 pacemaker-controld [123697] (quorum_notification_cb) notice: Quorum acquired | membership=54 members=2
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (crm_timer_popped) info: Election Trigger just popped | input=I_DC_TIMEOUT time=300000ms
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) warning: Input I_DC_TIMEOUT received in state S_PENDING from crm_timer_popped
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_state_transition) info: State transition S_PENDING -> S_ELECTION | input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (election_check) info: election-DC won by local node
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_log) info: Input I_ELECTION_DC received in state S_ELECTION from election_win_cb
> > Jan 29 14:19:26 gopher12 pacemaker-controld [123697] (do_state_transition) notice: State transition S_ELECTION -> S_INTEGRATION | input=I_ELECTION_DC cause=C_FSA_INTERNAL origin=election_win_cb
> > Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696] (recurring_op_for_active) info: Start 10s-interval monitor for gopher11_zpool on gopher11
> > Jan 29 14:19:26 gopher12 pacemaker-schedulerd[123696] (recurring_op_for_active) info: Start 10s-interval monitor for gopher12_zpool on gopher12
> >
> >
> > === initial cib.xml contents
> > <cib crm_feature_set="3.19.0" validate-with="pacemaker-3.9" epoch="9" num_updates="0" admin_epoch="0" cib-last-written="Mon Jan 29 11:07:06 2024" update-origin="gopher12" update-client="root" update-user="root" have-quorum="0" dc-uuid="2">
> >   <configuration>
> >     <crm_config>
> >       <cluster_property_set id="cib-bootstrap-options">
> >         <nvpair id="cib-bootstrap-options-stonith-action" name="stonith-action" value="off"/>
> >         <nvpair id="cib-bootstrap-options-have-watchdog" name="have-watchdog" value="false"/>
> >         <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="2.1.7-1.t4-2.1.7"/>
> >         <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="corosync"/>
> >         <nvpair id="cib-bootstrap-options-cluster-name" name="cluster-name" value="gopher11"/>
> >         <nvpair id="cib-bootstrap-options-cluster-recheck-inte" name="cluster-recheck-interval" value="60"/>
> >         <nvpair id="cib-bootstrap-options-start-failure-is-fat" name="start-failure-is-fatal" value="false"/>
> >         <nvpair id="cib-bootstrap-options-dc-deadtime" name="dc-deadtime" value="300"/>
> >       </cluster_property_set>
> >     </crm_config>
> >     <nodes>
> >       <node id="1" uname="gopher11"/>
uname="gopher12"/> > > </nodes> > > <resources/> > > <constraints/> > > </configuration> > > </cib> > > > > ________________________________________ > > From: Ken Gaillot <kgail...@redhat.com> > > Sent: Monday, January 29, 2024 10:51 AM > > To: Cluster Labs - All topics related to open-source clustering > > welcomed > > Cc: Faaland, Olaf P. > > Subject: Re: [ClusterLabs] controlling cluster behavior on startup > > > > On Mon, 2024-01-29 at 18:05 +0000, Faaland, Olaf P. via Users wrote: > > > Hi, > > > > > > I have configured clusters of node pairs, so each cluster has 2 > > > nodes. The cluster members are statically defined in corosync.conf > > > before corosync or pacemaker is started, and quorum {two_node: 1} > > > is > > > set. > > > > > > When both nodes are powered off and I power them on, they do not > > > start pacemaker at exactly the same time. The time difference may > > > be > > > a few minutes depending on other factors outside the nodes. > > > > > > My goals are (I call the first node to start pacemaker "node1"): > > > 1) I want to control how long pacemaker on node1 waits before > > > fencing > > > node2 if node2 does not start pacemaker. > > > 2) If node1 is part-way through that waiting period, and node2 > > > starts > > > pacemaker so they detect each other, I would like them to proceed > > > immediately to probing resource state and starting resources which > > > are down, not wait until the end of that "grace period". > > > > > > It looks from the documentation like dc-deadtime is how #1 is > > > controlled, and #2 is expected normal behavior. However, I'm > > > seeing > > > fence actions before dc-deadtime has passed. > > > > > > Am I misunderstanding Pacemaker's expected behavior and/or how dc- > > > deadtime should be used? > > > > You have everything right. The problem is that you're starting with > > an > > empty configuration every time, so the default dc-deadtime is being > > used for the first election (before you can set the desired value). > > > > I can't think of anything you can do to get around that, since the > > controller starts the timer as soon as it starts up. Would it be > > possible to bake an initial configuration into the PXE image? > > > > When the timer value changes, we could stop the existing timer and > > restart it. There's a risk that some external automation could make > > repeated changes to the timeout, thus never letting it expire, but > > that > > seems preferable to your problem. I've created an issue for that: > > > > > > https://urldefense.us/v3/__https:/projects.clusterlabs.org/T764 > > > > BTW there's also election-timeout. I'm not sure offhand how that > > interacts; it might be necessary to raise that one as well. > > > > > One possibly unusual aspect of this cluster is that these two nodes > > > are stateless - they PXE boot from an image on another server - and > > > I > > > build the cluster configuration at boot time with a series of pcs > > > commands, because the nodes have no local storage for this > > > purpose. 
> > > The commands are:
> > >
> > > ['pcs', 'cluster', 'start']
> > > ['pcs', 'property', 'set', 'stonith-action=off']
> > > ['pcs', 'property', 'set', 'cluster-recheck-interval=60']
> > > ['pcs', 'property', 'set', 'start-failure-is-fatal=false']
> > > ['pcs', 'property', 'set', 'dc-deadtime=300']
> > > ['pcs', 'stonith', 'create', 'fence_gopher11', 'fence_powerman', 'ip=192.168.64.65', 'pcmk_host_check=static-list', 'pcmk_host_list=gopher11,gopher12']
> > > ['pcs', 'stonith', 'create', 'fence_gopher12', 'fence_powerman', 'ip=192.168.64.65', 'pcmk_host_check=static-list', 'pcmk_host_list=gopher11,gopher12']
> > > ['pcs', 'resource', 'create', 'gopher11_zpool', 'ocf:llnl:zpool', 'import_options="-f -N -d /dev/disk/by-vdev"', 'pool=gopher11', 'op', 'start', 'timeout=805']
> > > ...
> > > ['pcs', 'property', 'set', 'no-quorum-policy=ignore']
> >
> > BTW you don't need to change no-quorum-policy when you're using two_node with Corosync.
> >
> > > I could, instead, generate a CIB so that when Pacemaker is started, it has a full config. Is that better?
> > >
> > > thanks,
> > > Olaf
> > >
> > > === corosync.conf:
> > > totem {
> > >     version: 2
> > >     cluster_name: gopher11
> > >     secauth: off
> > >     transport: udpu
> > > }
> > > nodelist {
> > >     node {
> > >         ring0_addr: gopher11
> > >         name: gopher11
> > >         nodeid: 1
> > >     }
> > >     node {
> > >         ring0_addr: gopher12
> > >         name: gopher12
> > >         nodeid: 2
> > >     }
> > > }
> > > quorum {
> > >     provider: corosync_votequorum
> > >     two_node: 1
> > > }
> > >
> > > === Log excerpt
> > >
> > > Here's an excerpt from the Pacemaker logs that reflects what I'm seeing. These are from gopher12, the node that came up first. The other node, which is not yet up, is gopher11.
> > >
> > > Jan 25 17:55:38 gopher12 pacemakerd [116033] (main) notice: Starting Pacemaker 2.1.7-1.t4 | build=2.1.7 features:agent-manpages ascii-docs compat-2.0 corosync-ge-2 default-concurrent-fencing generated-manpages monotonic nagios ncurses remote systemd
> > > Jan 25 17:55:39 gopher12 pacemaker-controld [116040] (peer_update_callback) info: Cluster node gopher12 is now member (was in unknown state)
> > > Jan 25 17:55:43 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']: <nvpair id="cib-bootstrap-options-dc-deadtime" name="dc-deadtime" value="300"/>
> > > Jan 25 17:56:00 gopher12 pacemaker-controld [116040] (crm_timer_popped) info: Election Trigger just popped | input=I_DC_TIMEOUT time=300000ms
> > > Jan 25 17:56:01 gopher12 pacemaker-based [116035] (cib_perform_op) info: ++ /cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options']: <nvpair id="cib-bootstrap-options-no-quorum-policy" name="no-quorum-policy" value="ignore"/>
> > > Jan 25 17:56:01 gopher12 pacemaker-controld [116040] (abort_transition_graph) info: Transition 0 aborted by cib-bootstrap-options-no-quorum-policy doing create no-quorum-policy=ignore: Configuration change | cib=0.26.0 source=te_update_diff_v2:464 path=/cib/configuration/crm_config/cluster_property_set[@id='cib-bootstrap-options'] complete=true
> > > Jan 25 17:56:01 gopher12 pacemaker-controld [116040] (controld_execute_fence_action) notice: Requesting fencing (off) targeting node gopher11 | action=11 timeout=60
> > >
> > >
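(To make Klaus's earlier point concrete: with corosync_votequorum, two_node: 1 automatically enables wait_for_all, so on a cold start neither node becomes quorate until it has seen its peer at least once - which is likely why the fence request in the log above only became possible once no-quorum-policy=ignore was added. Spelling the implied setting out, purely for illustration:)

quorum {
    provider: corosync_votequorum
    two_node: 1
    # Implied by two_node: 1; shown explicitly only for illustration.
    # On a fresh cluster start, quorum is withheld until both nodes
    # have been seen at least once.
    wait_for_all: 1
}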
> > --
> > Ken Gaillot <kgail...@redhat.com>
>
> --
> Ken Gaillot <kgail...@redhat.com>
>
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/