Re: [Pacemaker] query ?
On 29 Sep 2014, at 12:26 pm, Alex Samad - Yieldbroker wrote:

> Cool, thanks
>
> Thought it might have been a normal check.

If there is a problem we'll normally log it as 'error' or 'crit'

> A
>
>> -----Original Message-----
>> From: renayama19661...@ybb.ne.jp
>> Sent: Monday, 29 September 2014 12:20 PM
>> To: The Pacemaker cluster resource manager
>> Subject: Re: [Pacemaker] query ?
>>
>> [quoted text trimmed; see Hideo Yamauchi's message below]

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] query ?
Cool, thanks

Thought it might have been a normal check.

A

> -----Original Message-----
> From: renayama19661...@ybb.ne.jp
> Sent: Monday, 29 September 2014 12:20 PM
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] query ?
>
> [quoted text trimmed; see Hideo Yamauchi's message below]
Re: [Pacemaker] query ?
Hi Alex,

Because recheck_timer fires by default every 15 minutes, a state transition is calculated in pengine.

----
{ XML_CONFIG_ATTR_RECHECK, "cluster_recheck_interval", "time",
  "Zero disables polling. Positive values are an interval in seconds (unless other SI units are specified. eg. 5min)",
  "15min", &check_timer,
  "Polling interval for time based changes to options, resource parameters and constraints.",
  "The Cluster is primarily event driven, however the configuration can have elements that change based on time."
  " To ensure these changes take effect, we can optionally poll the cluster's status for changes." },
{ "load-threshold", NULL, "percentage", NULL, "80%", &check_utilization,
  "The maximum amount of system resources that should be used by nodes in the cluster",
  "The cluster will slow down its recovery process when the amount of system resources used"
  " (currently CPU) approaches this limit", },
----

Best Regards,
Hideo Yamauchi.

----- Original Message -----
> From: Alex Samad - Yieldbroker
> To: "pacemaker@oss.clusterlabs.org"
> Date: 2014/9/29, Mon 10:56
> Subject: [Pacemaker] query ?
>
> [quoted text trimmed; see the original message below]
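[Editorial aside: the interval described above is tunable as a cluster property. A sketch using crm_attribute; the property is usually spelled cluster-recheck-interval in the CIB, so verify the exact name against your Pacemaker version before relying on this.]

```
# Query the current value; if unset, the 15min default applies.
crm_attribute --type crm_config --name cluster-recheck-interval --query

# Change the polling interval to one hour, or set "0" to disable polling entirely.
crm_attribute --type crm_config --name cluster-recheck-interval --update 1h
```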
[Pacemaker] query ?
Hi

Is this normal logging ?

Not sure if I need to investigate anything

Sep 29 11:35:15 gsdmz1 crmd[2481]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
Sep 29 11:35:15 gsdmz1 pengine[2480]: notice: unpack_config: On loss of CCM Quorum: Ignore
Sep 29 11:35:15 gsdmz1 pengine[2480]: notice: process_pe_message: Calculated Transition 196: /var/lib/pacer/pengine/pe-input-247.bz2
Sep 29 11:35:15 gsdmz1 crmd[2481]: notice: run_graph: Transition 196 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacer/pengine/pe-input-247.bz2): Complete
Sep 29 11:35:15 gsdmz1 crmd[2481]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Sep 29 11:50:15 gsdmz1 crmd[2481]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
Sep 29 11:50:15 gsdmz1 pengine[2480]: notice: unpack_config: On loss of CCM Quorum: Ignore
Sep 29 11:50:15 gsdmz1 pengine[2480]: notice: process_pe_message: Calculated Transition 197: /var/lib/pacer/pengine/pe-input-247.bz2
Sep 29 11:50:15 gsdmz1 crmd[2481]: notice: run_graph: Transition 197 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacer/pengine/pe-input-247.bz2): Complete
Sep 29 11:50:15 gsdmz1 crmd[2481]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
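[Editorial aside: the two cause=C_TIMER_POPPED transitions in the log above are exactly 15 minutes apart (11:35:15 and 11:50:15), consistent with a periodic timer rather than a fault. A quick check of the spacing, assuming GNU date:]

```shell
# Timestamps copied from the two C_TIMER_POPPED log entries above; the year is irrelevant here.
t1=$(date -d "Sep 29 11:35:15" +%s)
t2=$(date -d "Sep 29 11:50:15" +%s)
echo $(( (t2 - t1) / 60 ))   # 15 minutes -- matches the 15min default recheck interval
```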
Re: [Pacemaker] Master is restarted when other node comes online
On Sun, 28 Sep 2014 13:03:08 +0200, emmanuel segura wrote:

> try to use interleave meta attribute in your clone definition
>
> http://www.hastexo.com/resources/hints-and-kinks/interleaving-pacemaker-clones

I'd appreciate it if you could elaborate a bit more here. The above link explains dependencies between two cloned resources. I have a single cloned resource, without any external dependency. I'm afraid I don't see how interleave helps here.

Thank you!

> 2014-09-28 9:56 GMT+02:00 Andrei Borzenkov :
>
> > [quoted text trimmed; see the original message below]
Re: [Pacemaker] Master is restarted when other node comes online
try to use interleave meta attribute in your clone definition

http://www.hastexo.com/resources/hints-and-kinks/interleaving-pacemaker-clones

2014-09-28 9:56 GMT+02:00 Andrei Borzenkov :
> [quoted text trimmed; see the original message below]

--
this is my life and I live it as long as God wills
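[Editorial aside: interleave is a clone/ms meta attribute. Applied to the configuration quoted in this thread, the suggestion would look roughly like the crm-shell fragment below; this only sketches the suggestion, and whether it prevents the master restart in this particular case is contested later in the thread.]

```
clone cln_SAPHanaTopology_HDB_HDB00 rsc_SAPHanaTopology_HDB_HDB00 \
        meta is-managed="true" clone-node-max="1" interleave="true"
ms msl_SAPHana_HDB_HDB00 rsc_SAPHana_HDB_HDB00 \
        meta clone-max="2" clone-node-max="1" interleave="true"
```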
[Pacemaker] Master is restarted when other node comes online
I have a two-node cluster with a single master/slave resource (replicated database) using pacemaker+openais on SLES11 SP3 (pacemaker 1.1.11-3ca8c3b). I hit a weird situation that I had not seen before, and I cannot really understand it. Assume the master runs on node A and the slave runs on node B. If I stop the cluster stack on B (rcopenais stop) and start it again (rcopenais start), the master is restarted. Of course this means service interruption. The same happens if I reboot node B. I have crm_report and can provide whatever logs are required, but I wanted first to quickly make sure this is not expected behavior.

I have not seen it before, but now when I recall what I tested, it was always a simulation of node failure. I did not really try the above scenario.

Assuming this is correct behavior - what is the correct procedure to shut down a single node then? It makes it impossible to do any maintenance on the slave node.

Configuration below:

node msksaphana1 \
        attributes hana_hdb_vhost="msksaphana1" hana_hdb_site="SITE1" hana_hdb_remoteHost="msksaphana2" hana_hdb_srmode="sync" lpa_hdb_lpt="1411732740"
node msksaphana2 \
        attributes hana_hdb_vhost="msksaphana2" hana_hdb_site="SITE2" hana_hdb_srmode="sync" hana_hdb_remoteHost="msksaphana1" lpa_hdb_lpt="30"
primitive rsc_SAPHanaTopology_HDB_HDB00 ocf:suse:SAPHanaTopology \
        params SID="HDB" InstanceNumber="00" \
        op monitor interval="10" timeout="600" \
        op start interval="0" timeout="600" \
        op stop interval="0" timeout="300"
primitive rsc_SAPHana_HDB_HDB00 ocf:suse:SAPHana \
        params SID="HDB" InstanceNumber="00" PREFER_SITE_TAKEOVER="true" AUTOMATED_REGISTER="true" DUPLICATE_PRIMARY_TIMEOUT="7200" \
        op start timeout="3600" interval="0" \
        op stop timeout="3600" interval="0" \
        op promote timeout="3600" interval="0" \
        op monitor timeout="700" role="Master" interval="60" \
        op monitor timeout="700" role="Slave" interval="61"
primitive rsc_ip_HDB_HDB00 ocf:heartbeat:IPaddr2 \
        params ip="10.72.10.64" \
        op start timeout="20" interval="0" \
        op stop timeout="20" interval="0" \
        op monitor interval="10" timeout="20"
primitive stonith_IPMI_msksaphana1 stonith:external/ipmi \
        params ipmitool="/usr/bin/ipmitool" hostname="msksaphana1" passwd="P@ssw0rd" userid="hacluster" ipaddr="10.72.5.47" \
        op stop timeout="15" interval="0" \
        op monitor timeout="20" interval="3600" \
        op start timeout="20" interval="0" \
        meta target-role="Started"
primitive stonith_IPMI_msksaphana2 stonith:external/ipmi \
        params ipmitool="/usr/bin/ipmitool" hostname="msksaphana2" passwd="P@ssw0rd" userid="hacluster" ipaddr="10.72.5.48" \
        op stop timeout="15" interval="0" \
        op monitor timeout="20" interval="3600" \
        op start timeout="20" interval="0" \
        meta target-role="Started"
ms msl_SAPHana_HDB_HDB00 rsc_SAPHana_HDB_HDB00 \
        meta clone-max="2" clone-node-max="1" target-role="Started"
clone cln_SAPHanaTopology_HDB_HDB00 rsc_SAPHanaTopology_HDB_HDB00 \
        meta is-managed="true" clone-node-max="1" target-role="Started"
location stonoth_IPMI_msksaphana1_on_msksaphana2 stonith_IPMI_msksaphana1 -inf: msksaphana1
location stonoth_IPMI_msksaphana2_on_msksaphana1 stonith_IPMI_msksaphana2 -inf: msksaphana2
colocation col_saphana_ip_HDB_HDB00 2000: rsc_ip_HDB_HDB00:Started msl_SAPHana_HDB_HDB00:Master
order ord_SAPHana_HDB_HDB00 2000: cln_SAPHanaTopology_HDB_HDB00 msl_SAPHana_HDB_HDB00
property $id="cib-bootstrap-options" \
        stonith-enabled="true" \
        placement-strategy="balanced" \
        dc-version="1.1.11-3ca8c3b" \
        cluster-infrastructure="classic openais (with plugin)" \
        expected-quorum-votes="2" \
        stonith-action="reboot" \
        no-quorum-policy="ignore" \
        last-lrm-refresh="1411730405"
rsc_defaults $id="rsc-options" \
        resource-stickiness="1" \
        migration-threshold="3"
op_defaults $id="op-options" \
        timeout="600" \
        record-pending="true"

Thank you!

-andrei
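[Editorial aside: on the maintenance question above ("what is the correct procedure to shut down a single node"), a commonly used approach is to put the node into standby first, so its resources are stopped or migrated deliberately, and only then stop the cluster stack. A sketch with the crm shell, using the node name from the configuration above; this illustrates the general procedure, not a confirmed fix for the master restart described here.]

```
crm node standby msksaphana2   # stop/migrate resources hosted on the node
rcopenais stop                 # now stop the cluster stack for maintenance

# ...perform maintenance, then bring the node back...

rcopenais start
crm node online msksaphana2    # allow the node to host resources again
```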