Re: [Pacemaker] fencing question
On 14 Mar 2014, at 1:18 am, Karl Rößmann wrote:

> Hi,
>
> I changed the running resource by
> crm / configure / edit / commit. It seemed to work.
>
> I stopped the resource, and changed some details,
> Whenever I commit again I get this warning:
> warning: do_log: FSA: Input I_ELECTION_DC from do_election_check() received in state S_INTEGRATION
>
> see below
>
> Mar 13 15:02:04 ha1infra crm_verify[24991]: notice: crm_log_args: Invoked: crm_verify -V -p
> Mar 13 15:02:04 ha1infra cibadmin[24992]: notice: crm_log_args: Invoked: cibadmin -p -R
> Mar 13 15:02:04 ha1infra crmd[21812]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: Diff: --- 0.1057.3
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: Diff: +++ 0.1058.1 a460a945dcf52bbb4ffb39e7963ee925
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: -- admin_epoch="0" epoch="1057" num_updates="3"/>
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++id="vmdv03" class="ocf" provider="heartbeat" type="Xen">
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="target-role" value="Stopped" id="vmdv03-meta_attributes-target-role"/>
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="allow-migrate" value="true" id="vmdv03-meta_attributes-allow-migrate"/>
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="monitor" interval="10" timeout="30" id="vmdv03-monitor-10"/>
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="migrate_from" interval="0" timeout="600" id="vmdv03-migrate_from-0"/>
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="migrate_to" interval="0" timeout="600" id="vmdv03-migrate_to-0"/>
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="xmfile" value="/etc/xen/vm/vmdv03" id="vmdv03-instance_attributes-xmfile"/>
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="shutdown_timeout" value="120" id="vmdv03-instance_attributes-shutdown_timeout"/>
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
> Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
> Mar 13 15:02:04 ha1infra crmd[21812]: notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_TIMER_POPPED origin=election_timeout_popped ]
> Mar 13 15:02:04 ha1infra crmd[21812]: warning: do_log: FSA: Input I_ELECTION_DC from do_election_check() received in state S_INTEGRATION <-- what does this mean ?

It means that something not completely normal is going on.
Possibly the nodes can't talk to each other, but I'm betting on a bug of some kind.

> Mar 13 15:02:04 ha1infra crmd[21812]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
> Mar 13 15:02:04 ha1infra crmd[21812]: notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_TIMER_POPPED origin=election_timeout_popped ]

There's not enough time for a timer to have really expired.
Probably a good idea to contact SUSE support (and configure a log file, it will contain more information than syslog).

> Mar 13 15:02:04 ha1infra attrd[21810]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
> Mar 13 15:02:04 ha1infra attrd[21810]: notice: attrd_trigger_update: Sending flush op to all hosts for: shutdown (0)
> Mar 13 15:02:04 ha1infra crmd[21812]: notice: crm_update_quorum: Updating quorum status to true (call=457)
> Mar 13 15:02:04 ha1infra attrd[21810]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
>
> Karl
>
>> On 2014-03-12T16:16:54, Karl Rößmann wrote:
>>
>>> >>primitive fkflmw ocf:heartbeat:Xen \
>>> >>meta target-role="Started" is-managed="true" allow-migrate="true" \
>>> >>op monitor interval="10" timeout="30" \
>>> >>op migrate_from interval="0" timeout="600" \
>>> >>op migrate_to interval="0" timeout="600" \
>>> >>params xmfile="/etc/xen/vm/fkflmw" shutdown_timeout="120"
>>> >
>>> >You need to set a >120s timeout for the stop operation too:
>>> > op stop timeout="150"
>>> >
>>> >>default-action-timeout="60s"
>>> >
>>> >Or set this to, say, 150s.
>>> can I do this while the resou
Re: [Pacemaker] fencing question
Hi,

I changed the running resource by
crm / configure / edit / commit. It seemed to work.

I stopped the resource, and changed some details,
Whenever I commit again I get this warning:
warning: do_log: FSA: Input I_ELECTION_DC from do_election_check() received in state S_INTEGRATION

see below

Mar 13 15:02:04 ha1infra crm_verify[24991]: notice: crm_log_args: Invoked: crm_verify -V -p
Mar 13 15:02:04 ha1infra cibadmin[24992]: notice: crm_log_args: Invoked: cibadmin -p -R
Mar 13 15:02:04 ha1infra crmd[21812]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=abort_transition_graph ]
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: Diff: --- 0.1057.3
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: Diff: +++ 0.1058.1 a460a945dcf52bbb4ffb39e7963ee925
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: -- admin_epoch="0" epoch="1057" num_updates="3"/>
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++id="vmdv03" class="ocf" provider="heartbeat" type="Xen">
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="target-role" value="Stopped" id="vmdv03-meta_attributes-target-role"/>
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="allow-migrate" value="true" id="vmdv03-meta_attributes-allow-migrate"/>
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="monitor" interval="10" timeout="30" id="vmdv03-monitor-10"/>
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="migrate_from" interval="0" timeout="600" id="vmdv03-migrate_from-0"/>
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="migrate_to" interval="0" timeout="600" id="vmdv03-migrate_to-0"/>
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="xmfile" value="/etc/xen/vm/vmdv03" id="vmdv03-instance_attributes-xmfile"/>
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++name="shutdown_timeout" value="120" id="vmdv03-instance_attributes-shutdown_timeout"/>
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
Mar 13 15:02:04 ha1infra cib[21807]: notice: cib:diff: ++
Mar 13 15:02:04 ha1infra crmd[21812]: notice: do_state_transition: State transition S_ELECTION -> S_INTEGRATION [ input=I_ELECTION_DC cause=C_TIMER_POPPED origin=election_timeout_popped ]
Mar 13 15:02:04 ha1infra crmd[21812]: warning: do_log: FSA: Input I_ELECTION_DC from do_election_check() received in state S_INTEGRATION <-- what does this mean ?
Mar 13 15:02:04 ha1infra attrd[21810]: notice: attrd_local_callback: Sending full refresh (origin=crmd)
Mar 13 15:02:04 ha1infra attrd[21810]: notice: attrd_trigger_update: Sending flush op to all hosts for: shutdown (0)
Mar 13 15:02:04 ha1infra crmd[21812]: notice: crm_update_quorum: Updating quorum status to true (call=457)
Mar 13 15:02:04 ha1infra attrd[21810]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)

Karl

> On 2014-03-12T16:16:54, Karl Rößmann wrote:
>
>> >>primitive fkflmw ocf:heartbeat:Xen \
>> >>meta target-role="Started" is-managed="true" allow-migrate="true" \
>> >>op monitor interval="10" timeout="30" \
>> >>op migrate_from interval="0" timeout="600" \
>> >>op migrate_to interval="0" timeout="600" \
>> >>params xmfile="/etc/xen/vm/fkflmw" shutdown_timeout="120"
>> >
>> >You need to set a >120s timeout for the stop operation too:
>> >op stop timeout="150"
>> >
>> >>default-action-timeout="60s"
>> >
>> >Or set this to, say, 150s.
>> can I do this while the resource (the xen VM) is running ?
>
> Yes, changing the stop timeout should not have a negative impact on your
> resource.
>
> You can also check how the cluster would react:
>
> # crm configure
> crm(live)configure# edit
> (Make all changes you want here)
> crm(live)configure# simulate actions nograph
>
> before you type "commit".
>
> Regards,
>     Lars
>
> --
> Architect Storage/HA
> SUSE LINUX Products GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 21284 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
>
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

--
Karl Rößmann                Tel. +49-711-689-1657
Max-Planck-Institut FKF     Fax. +49-711-689-1632
Postfach 800 665
70506 Stuttgart             email k.roessm...@fkf.mpg.de
Re: [Pacemaker] fencing question
On 2014-03-12T16:16:54, Karl Rößmann wrote:

> >>primitive fkflmw ocf:heartbeat:Xen \
> >>meta target-role="Started" is-managed="true" allow-migrate="true" \
> >>op monitor interval="10" timeout="30" \
> >>op migrate_from interval="0" timeout="600" \
> >>op migrate_to interval="0" timeout="600" \
> >>params xmfile="/etc/xen/vm/fkflmw" shutdown_timeout="120"
> >
> >You need to set a >120s timeout for the stop operation too:
> > op stop timeout="150"
> >
> >>default-action-timeout="60s"
> >
> >Or set this to, say, 150s.
> can I do this while the resource (the xen VM) is running ?

Yes, changing the stop timeout should not have a negative impact on your
resource.

You can also check how the cluster would react:

# crm configure
crm(live)configure# edit
(Make all changes you want here)
crm(live)configure# simulate actions nograph

before you type "commit".

Regards,
    Lars
Re: [Pacemaker] fencing question
Hi.

>>primitive fkflmw ocf:heartbeat:Xen \
>>meta target-role="Started" is-managed="true" allow-migrate="true" \
>>op monitor interval="10" timeout="30" \
>>op migrate_from interval="0" timeout="600" \
>>op migrate_to interval="0" timeout="600" \
>>params xmfile="/etc/xen/vm/fkflmw" shutdown_timeout="120"
>
>You need to set a >120s timeout for the stop operation too:
> op stop timeout="150"
>
>>default-action-timeout="60s"
>
>Or set this to, say, 150s.

can I do this while the resource (the xen VM) is running ?

Karl
Re: [Pacemaker] fencing question
On 2014-03-12T15:17:13, Karl Rößmann wrote:

> Hi,
>
> we have a two node HA cluster using SuSE SlES 11 HA Extension SP3,
> latest release value.
> A resource (xen) was manually stopped, the shutdown_timeout is 120s
> but after 60s the node was fenced and shut down by the other node.
>
> should I change some timeout value ?
>
> This is a part of our configuration:
> ...
> primitive fkflmw ocf:heartbeat:Xen \
>     meta target-role="Started" is-managed="true" allow-migrate="true" \
>     op monitor interval="10" timeout="30" \
>     op migrate_from interval="0" timeout="600" \
>     op migrate_to interval="0" timeout="600" \
>     params xmfile="/etc/xen/vm/fkflmw" shutdown_timeout="120"

You need to set a >120s timeout for the stop operation too:

    op stop timeout="150"

> default-action-timeout="60s"

Or set this to, say, 150s.

Regards,
    Lars
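
For reference, the advice in this reply applied to the posted primitive might look roughly like the sketch below. The primitive had no stop operation of its own, so the stop presumably ran under the 60s default-action-timeout, gave up before the agent's 120s shutdown_timeout had elapsed, and the failed stop was escalated to fencing. Only the timeout="150" value comes from the reply; the interval="0" on the stop operation and its position in the list are assumptions made here for illustration.

```
# Sketch only: the posted primitive plus an explicit stop operation.
# timeout="150" is the value suggested in the reply; interval="0" is an
# assumption, written in the same style as the existing migrate operations.
primitive fkflmw ocf:heartbeat:Xen \
    meta target-role="Started" is-managed="true" allow-migrate="true" \
    op monitor interval="10" timeout="30" \
    op stop interval="0" timeout="150" \
    op migrate_from interval="0" timeout="600" \
    op migrate_to interval="0" timeout="600" \
    params xmfile="/etc/xen/vm/fkflmw" shutdown_timeout="120"
```

The alternative from the same reply is to leave the primitive alone and change default-action-timeout="60s" to "150s" in the property section. That applies to every operation without an explicit timeout, so the per-resource stop timeout is the narrower of the two changes.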