Re: [Pacemaker] Weired resource-stickiness behavior

Andrew Beekhof Mon, 17 Jun 2013 20:49:50 -0700

On 14/06/2013, at 3:52 PM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:


> Hi, Andrew:
> If I cut the network connection of the running node with:
> service network stop
> "crm status" shows the node in "OFFLINE" status. The affected 
> resources also fail over to another online node correctly. But the 
> issue is that when I reconnect the network with:
> service network start
> to bring the "OFFLINE" node back "Online", all the resources are first 
> stopped, then some resources are restarted on the original online node 
> and others go back to the newly "Online" node. This 
> behavior does not seem related to the resource-stickiness configuration.
> I'm just curious whether it's the default behavior.

It is when you've disabled fencing and the service is still running on the 
"OFFLINE" node.
http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/ch13.html#_what_is_stonith
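
As a rough illustration of what enabling fencing could look like in crm shell 
syntax, here is a minimal sketch. It assumes IPMI-capable hardware and the 
fence_ipmilan agent; the resource and constraint names (fence_node1, 
fence_node1_placement), the address, and the credentials are hypothetical 
placeholders, not values from this cluster. A similar device would be needed 
for each node, and the right agent and parameters depend on your hardware.

```
# Hypothetical fencing device for node1 (address and credentials are placeholders)
primitive fence_node1 stonith:fence_ipmilan \
    params pcmk_host_list="node1" ipaddr="192.0.2.11" login="admin" passwd="secret" \
    op monitor interval="60s"
# Keep the device off the node it is meant to fence
location fence_node1_placement fence_node1 -inf: node1
# Turn fencing on once every node has a working device
property stonith-enabled="true"
```

With fencing active, a node that drops off the network should be powered off 
or reset before its resources are recovered elsewhere, so it cannot rejoin 
with the services still running.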

> And if I reboot the OFFLINE node instead, the resources are not stopped 
> when it comes online again.
> Is it expected that "service network start" triggers Pacemaker to reassign 
> resources?
> Thanks.
> 
> 
> 
> On Fri, Jun 14, 2013 at 10:06 AM, Andrew Beekhof <and...@beekhof.net> wrote:
> 
> On 13/06/2013, at 5:15 PM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:
> 
> > Thanks Andrew.
> > Yes, the fs_ssn service (ocf:heartbeat:Filesystem) is still running when the 
> > machine loses network. I configured it as a primitive:
> > primitive fs_ssn ocf:heartbeat:Filesystem \
> >      op monitor interval="15s" \
> >      params device="/dev/drbd0" directory="/drbd" fstype="ext3" \
> >      meta target-role="Started"
> > Since this resource can only be started on one node, I assume it should 
> > be stopped automatically when Pacemaker detects it's no longer part of 
> > the HA cluster. Is this an incorrect assumption?
> 
> No. But I'd need to see logs from all the nodes (please use attachments) to 
> be able to comment further.
> 
> > Thanks.
> >
> >
> >
> > On Thu, Jun 13, 2013 at 1:50 PM, Andrew Beekhof <and...@beekhof.net> wrote:
> >
> > On 13/06/2013, at 2:43 PM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:
> >
> > > Andrew Beekhof <andrew@...> writes:
> > >
> > >>
> > >> Try increasing your stickiness as it is being exceeded by the location
> > >> constraints.
> > >> For the biggest stick, try 'infinity', which means: never move unless the
> > >> node dies.
> > >>
> > >
> > > Thanks, Andrew. I applied infinity resource stickiness. However, the sst
> > > resource is still switched to the node that comes back online after the
> > > failure. I also found something in the log:
> > >
> > > Jun 13 11:46:29 node3 pengine[27813]:  warning: unpack_rsc_op: Processing
> > > failed op monitor for ip_ssn on node2: not running (7)
> > > Jun 13 11:46:29 node3 pengine[27813]:    error: native_create_actions:
> > > Resource fs_ssn (ocf::Filesystem) is active on 2 nodes attempting recovery
> > > Jun 13 11:46:29 node3 pengine[27813]:  warning: native_create_actions: See
> > > http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more 
> > > information.
> > >
> > > Is this log showing that Pacemaker tries to restart all the resources when
> > > the failed node is back again?
> >
> > No, that's a log showing the services were already running there when 
> > Pacemaker started.
> >
> > >
> > >
> > >>> Thanks.
> > >>>
> > >>> Below is my configure:
> > >>> ------------------CONFIG START--------------------------------------
> > >>> node node3 \
> > >>>     attributes standby="on"
> > >>> node node1
> > >>> node node2
> > >>> primitive drbd_ssn ocf:linbit:drbd \
> > >>>     params drbd_resource="r0" \
> > >>>     op monitor interval="15s"
> > >>> primitive fs_ssn ocf:heartbeat:Filesystem \
> > >>>     op monitor interval="15s" \
> > >>>     params device="/dev/drbd0" directory="/drbd" fstype="ext3" \
> > >>>     meta target-role="Started"
> > >>> primitive ip_ssn ocf:heartbeat:IPaddr2 \
> > >>>     params ip="192.168.241.1" cidr_netmask="32" \
> > >>>     op monitor interval="15s" \
> > >>>     meta target-role="Started"
> > >>> primitive ip_sst ocf:heartbeat:IPaddr2 \
> > >>>     params ip="192.168.241.2" cidr_netmask="32" \
> > >>>     op monitor interval="15s" \
> > >>>     meta target-role="Started"
> > >>> primitive sst lsb:sst \
> > >>>     op monitor interval="15s" \
> > >>>     meta target-role="stopped"
> > >>> primitive ssn lsb:ssn \
> > >>>     op monitor interval="15s" \
> > >>>     meta target-role="stopped"
> > >>> ms ms_drbd_ssn drbd_ssn \
> > >>>     meta master-max="1" master-node-max="1" clone-max="2" \
> > >>>     clone-node-max="1" notify="true" target-role="Started"
> > >>> location sst_ip_prefer ip_sst 50: node1
> > >>> location drbd_ssn_prefer ms_drbd_ssn 50: node1
> > >>> colocation fs_ssn_coloc inf: ip_ssn fs_ssn
> > >>> colocation fs_on_drbd_coloc inf: fs_ssn ms_drbd_ssn:Master
> > >>> colocation sst_ip_coloc inf: sst ip_sst
> > >>> colocation ssn_ip_coloc inf: ssn ip_ssn
> > >>> order ssn_after_drbd inf: ms_drbd_ssn:promote fs_ssn:start
> > >>> order ip_after_fs inf: fs_ssn:start ip_ssn:start
> > >>> order sst_after_ip inf: ip_sst:start sst:start
> > >>> order sst_after_ssn inf: ssn:start sst:start
> > >>> order ssn_after_ip inf: ip_ssn:start ssn:start
> > >>> property $id="cib-bootstrap-options" \
> > >>>     dc-version="1.1.8-7.el6-394e906" \
> > >>>     cluster-infrastructure="classic openais (with plugin)" \
> > >>>     expected-quorum-votes="3" \
> > >>>     stonith-enabled="false"
> > >>> rsc_defaults $id="rsc-options" \
> > >>>     resource-stickiness="100"
> > >>>
> > >>> -------------------CONFIG END----------------------------------------
> > >>>
> > > Best Regards.
> > > Xiaomin
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > _______________________________________________
> > > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > >
> > > Project Home: http://www.clusterlabs.org
> > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > Bugs: http://bugs.clusterlabs.org
> >
> >
> 
> 

