On 2015-10-08 21:20, Ken Gaillot wrote:
On 10/08/2015 10:16 AM, Priyanka wrote:
Hi,
We are trying to build an HA setup for our servers using the DRBD +
Corosync + Pacemaker stack.
Attached are the configuration files for Corosync/Pacemaker and DRBD.
A few things I noticed:
* Don't set become-primary-on in the DRBD configuration in a Pacemaker
cluster; Pacemaker should handle all promotions to primary.
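In other words, the DRBD resource section should not contain that
directive (a sketch only; the resource name, devices and addresses below
are placeholders, not taken from the attached file):

```
# /etc/drbd.d/r0.res -- placeholder names, devices and addresses
resource r0 {
  # become-primary-on both;   <-- omit this line; the Pacemaker
  # master/slave resource decides which node gets promoted
  device    /dev/drbd0;
  disk      /dev/sdb1;
  meta-disk internal;
  on lock { address 10.0.0.1:7789; }
  on sher { address 10.0.0.2:7789; }
}
```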
* I'm no NFS expert, but why is res_exportfs_root cloned? Can both
servers export it at the same time? I would expect it to be in the group
before res_exportfs_export1.
We followed this configuration guide for our setup:
https://www.suse.com/documentation/sle_ha/singlehtml/book_sleha_techguides/book_sleha_techguides.html
which suggests creating a clone of this resource. This resource does not
export the actual data; the data is exported by the res_exportfs_export1
resource in our setup. I did try the previous fail-over scenario without
cloning this resource, but the same error appeared.
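Roughly, the clone from that guide looks like the following in crm shell
syntax (a sketch only; the fsid, directory, options and clientspec
values are placeholders, not our actual configuration):

```
# crm shell sketch -- parameter values are placeholders
primitive res_exportfs_root ocf:heartbeat:exportfs \
    params fsid=0 directory="/srv/nfs" \
        options="rw,crossmnt" clientspec="10.0.0.0/24" \
    op monitor interval=30s
clone cl_exportfs_root res_exportfs_root
```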
* Your constraints need some adjustment. Partly it depends on the answer
to the previous question, but currently res_fs (via the group) is
ordered after res_exportfs_root, and I don't see how that could work.
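For illustration, constraints that tie the export group to the DRBD
promotion typically look like this in crm shell syntax (a sketch only;
ms_drbd_r0 is a placeholder for your DRBD master/slave resource name,
while rg_export comes from this thread):

```
# crm shell sketch -- ms_drbd_r0 is a placeholder name
colocation c_rg_on_drbd_master inf: rg_export ms_drbd_r0:Master
order o_drbd_promote_then_rg inf: ms_drbd_r0:promote rg_export:start
```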
We are getting errors while testing this setup.
1. When we stop Corosync on the master machine, say server1 (lock), it
is STONITHed. In this case the slave, server2 (sher), is promoted to
master. But when server1 (lock) reboots, res_exportfs_export1 is started
on both servers, that resource goes into a failed state, and the servers
go into an unclean state.
Then server1 (lock) reboots and server2 (sher) is master but in an
unclean state. After server1 (lock) comes up, server2 (sher) is
STONITHed and server1 (lock) is slave (the only online node).
When server2 (sher) comes up, both servers are slaves and the resource
group (rg_export) is stopped. Then server2 (sher) becomes master,
server1 (lock) is slave, and the resource group is started.
At this point the configuration becomes stable.
PFA logs (syslog) of server2 (sher) from when it is promoted to master
until it is first rebooted, when the exportfs resource goes into a
failed state.
Please let us know if the configuration is appropriate. From the logs we
could not figure out the exact reason for the resource failure.
Your comments on this scenario would be very helpful.
Thanks,
Priyanka
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
--
Regards,
Priyanka
MTech3 Sysad
IIT Powai