On 14/06/2013, at 3:52 PM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:
> Hi, Andrew:
> If I cut down the network connection of the running node by:
> service network stop,
> "crm status" will show me the node is put into "OFFLINE" status. The affected
> resource can also be failed over to another online node correctly. But the
> issue is that, when I re-connect the network by:
> service network start.
> to put the "OFFLINE" node to be "Online" again, all the resource is firstly
> stopped , then some resource are restarted again on the original online node
> and some other resource are going back to the newly "Online" node. This
> behavior seems not related to the resource-stickiness configuration.
> I'm just curious if it's the default behavior.

It is when you've disabled fencing and the service is still running on the "OFFLINE" node.

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html/Pacemaker_Explained/ch13.html#_what_is_stonith

> And if I tried to reboot the OFFLINE node, when it's online again, the
> resource won't be stopped.
> Is this expected that "service network start" triggers Pacemaker to reassign
> resource?
> Thanks.
>
>
>
> On Fri, Jun 14, 2013 at 10:06 AM, Andrew Beekhof <and...@beekhof.net> wrote:
>
> On 13/06/2013, at 5:15 PM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:
>
> > Thanks Andrew.
> > Yes, the fs_ssn service (ocf:FileSystem) is still running when the machine
> > loses network. I configure it as primitive:
> > primitive fs_ssn ocf:heartbeat:Filesystem \
> >     op monitor interval="15s" \
> >     params device="/dev/drbd0" directory="/drbd" fstype="ext3" \
> >     meta target-role="Started"
> > As I assume this resource can only be started on 1 node, I think it should
> > be stopped automatically when pacemaker detects it's not in a HA cluster.
> > Is this incorrect assumption?
>
> No. But I'd need to see logs from all the nodes (please use attachments) to
> be able to comment further.
>
> > Thanks.
> >
> >
> >
> > On Thu, Jun 13, 2013 at 1:50 PM, Andrew Beekhof <and...@beekhof.net> wrote:
> >
> > On 13/06/2013, at 2:43 PM, Xiaomin Zhang <zhangxiao...@gmail.com> wrote:
> >
> > > Andrew Beekhof <andrew@...> writes:
> > >
> > >>
> > >> Try increasing your stickiness as it is being exceeded by the location constraints.
> > >> For the biggest stick, try 'infinity' which means - never move unless the node dies.
> > >>
> > >
> > > Thanks, Andrew, I applied infinity resource stickiness. However, the sst
> > > resource is still switched to the node which is online back from failure.
> > > And I found sth in the log:
> > >
> > > Jun 13 11:46:29 node3 pengine[27813]: warning: unpack_rsc_op: Processing failed op monitor for ip_ssn on node2: not running (7)
> > > Jun 13 11:46:29 node3 pengine[27813]: error: native_create_actions: Resource fs_ssn (ocf::Filesystem) is active on 2 nodes attempting recovery
> > > Jun 13 11:46:29 node3 pengine[27813]: warning: native_create_actions: See http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.
> > >
> > > Is this log showing that pacemaker tries to restart all the resource when
> > > the failed node is back again?
> >
> > No, thats a log showing the services were already running there when
> > pacemaker started.
> >
> > >
> > >
> > >>> Thanks.
> > >>>
> > >>> Below is my configure:
> > >>> ------------------CONFIG START--------------------------------------
> > >>> node node3 \
> > >>>     attributes standby="on"
> > >>> node node1
> > >>> node node2
> > >>> primitive drbd_ssn ocf:linbit:drbd \
> > >>>     params drbd_resource="r0" \
> > >>>     op monitor interval="15s"
> > >>> primitive fs_ssn ocf:heartbeat:Filesystem \
> > >>>     op monitor interval="15s" \
> > >>>     params device="/dev/drbd0" directory="/drbd" fstype="ext3" \
> > >>>     meta target-role="Started"
> > >>> primitive ip_ssn ocf:heartbeat:IPaddr2 \
> > >>>     params ip="192.168.241.1" cidr_netmask="32" \
> > >>>     op monitor interval="15s" \
> > >>>     meta target-role="Started"
> > >>> primitive ip_sst ocf:heartbeat:IPaddr2 \
> > >>>     params ip="192.168.241.2" cidr_netmask="32" \
> > >>>     op monitor interval="15s" \
> > >>>     meta target-role="Started"
> > >>> primitive sst lsb:sst \
> > >>>     op monitor interval="15s" \
> > >>>     meta target-role="stopped"
> > >>> primitive ssn lsb:ssn \
> > >>>     op monitor interval="15s" \
> > >>>     meta target-role="stopped"
> > >>> ms ms_drbd_ssn drbd_ssn \
> > >>>     meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" target-role="Started"
> > >>> location sst_ip_prefer ip_sst 50: node1
> > >>> location drbd_ssn_prefer ms_drbd_ssn 50: node1
> > >>> colocation fs_ssn_coloc inf: ip_ssn fs_ssn
> > >>> colocation fs_on_drbd_coloc inf: fs_ssn ms_drbd_ssn:Master
> > >>> colocation sst_ip_coloc inf: sst ip_sst
> > >>> colocation ssn_ip_coloc inf: ssn ip_ssn
> > >>> order ssn_after_drbd inf: ms_drbd_ssn:promote fs_ssn:start
> > >>> order ip_after_fs inf: fs_ssn:start ip_ssn:start
> > >>> order sst_after_ip inf: ip_sst:start sst:start
> > >>> order sst_after_ssn inf: ssn:start sst:start
> > >>> order ssn_after_ip inf: ip_ssn:start ssn:start
> > >>> property $id="cib-bootstrap-options" \
> > >>>     dc-version="1.1.8-7.el6-394e906" \
> > >>>     cluster-infrastructure="classic openais (with plugin)" \
> > >>>     expected-quorum-votes="3" \
> > >>>     stonith-enabled="false"
> > >>> rsc_defaults $id="rsc-options" \
> > >>>     resource-stickiness="100"
> > >>>
> > >>> -------------------CONFIG END----------------------------------------
> > >>>
> > > Best Regards.
> > > Xiaomin
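
For anyone hitting the same symptom, here is a minimal sketch of how one might scan a syslog file for the "is active on N nodes" error quoted earlier in this thread. It assumes Python 3 and log lines shaped like the three pengine lines above, re-joined onto single lines. The names TOO_ACTIVE, find_too_active and SAMPLE are made up for illustration and are not part of Pacemaker; other Pacemaker versions may word the message differently.

```python
import re

# Hypothetical pattern, modelled only on the pengine error quoted in this
# thread: "... error: native_create_actions: Resource fs_ssn
# (ocf::Filesystem) is active on 2 nodes attempting recovery"
TOO_ACTIVE = re.compile(
    r"pengine\[\d+\]:\s+error:\s+native_create_actions:\s+"
    r"Resource\s+(?P<rsc>\S+)\s+\((?P<agent>[^)]*)\)\s+"
    r"is active on (?P<count>\d+) nodes"
)


def find_too_active(lines):
    """Return (resource, agent, node_count) for each matching error line."""
    hits = []
    for line in lines:
        match = TOO_ACTIVE.search(line)
        if match:
            hits.append((match["rsc"], match["agent"], int(match["count"])))
    return hits


# The three log lines from the thread, each re-joined onto one line.
SAMPLE = [
    "Jun 13 11:46:29 node3 pengine[27813]: warning: unpack_rsc_op: "
    "Processing failed op monitor for ip_ssn on node2: not running (7)",
    "Jun 13 11:46:29 node3 pengine[27813]: error: native_create_actions: "
    "Resource fs_ssn (ocf::Filesystem) is active on 2 nodes attempting recovery",
    "Jun 13 11:46:29 node3 pengine[27813]: warning: native_create_actions: "
    "See http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more "
    "information.",
]

if __name__ == "__main__":
    for rsc, agent, count in find_too_active(SAMPLE):
        print(f"{rsc} ({agent}) was found active on {count} nodes")
```

A hit means the policy engine found the resource running in more than one place when the node rejoined, which is the situation described above with fencing disabled. In that case the cluster recovers the resource regardless of resource-stickiness.
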
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org