[Pacemaker] How to serialize/control resource startup on Standby node
Hello, I have a cluster with two nodes running multiple master/slave resources. Ordering of the resources on the master node is achieved with the order option of crm. When the standby node starts, the processes are started one after another. Here is the configuration:

primitive ClusterIP ocf:mcg:MCG_VIPaddr_RA \
        params ip="192.168.113.67" cidr_netmask="255.255.255.0" nic="eth0:1" \
        op monitor interval="40" timeout="20"
primitive Rmgr ocf:mcg:RM_RA \
        op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
        op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
primitive Tmgr ocf:mcg:TM_RA \
        op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
        op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
primitive pimd ocf:mcg:PIMD_RA \
        op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
        op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
ms ms_Rmgr Rmgr \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
ms ms_Tmgr Tmgr \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
ms ms_pimd pimd \
        meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
colocation ip_with_Rmgr inf: ClusterIP ms_Rmgr:Master
colocation ip_with_Tmgr inf: ClusterIP ms_Tmgr:Master
colocation ip_with_pimd inf: ClusterIP ms_pimd:Master
order TM-after-RM inf: ms_Rmgr:promote ms_Tmgr:start
order ip-after-pimd inf: ms_pimd:promote ClusterIP:start
order pimd-after-TM inf: ms_Tmgr:promote ms_pimd:start
property $id="cib-bootstrap-options" \
        dc-version="1.0.11-db98485d06ed3fe0fe236509f023e1bd4a5566f1" \
        cluster-infrastructure="Heartbeat" \
        no-quorum-policy="ignore" \
        stonith-enabled="false"
rsc_defaults $id="rsc-options" \
        migration-threshold="3" \
        resource-stickiness="100"

I have a system requirement in which the start of one resource (e.g. pimd) depends on the successful start of another resource (e.g. Tmgr). Everything runs smoothly on the master node, thanks to the ordering and the few seconds of delay until a resource is promoted to Master. But on the standby node the resources are started one after another without any delay, so the standby node behaves erratically.

Is there a way through which I can serialize/control resource startup on the standby node?

Thanks and regards
Neha Chatrath
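One possible approach (a sketch, not a confirmed answer from this thread): the order constraints above only sequence starts after promotions, and no promotion happens for the slave instances on the standby node, so adding start-to-start constraints would serialize the slave startups as well. The constraint names below are made up for illustration:

order RM-start-before-TM-start inf: ms_Rmgr:start ms_Tmgr:start
order TM-start-before-pimd-start inf: ms_Tmgr:start ms_pimd:start

Setting interleave="true" in the ms resources' meta attributes would make such ordering apply between the instances on each node rather than across the whole clone set. Note also that ordering only introduces a real delay if each RA's start action does not return until its service is actually up.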
Re: [Pacemaker] Remote CRM shell from LCMC
On Wed, Dec 28, 2011 at 12:57:33AM +0100, Rasto Levrinc wrote:
> Hi,
>
> this being a slow news day, there is this great new feature in LCMC, but
> it's probably completely useless. :) LCMC used to show the CRM shell
> configuration for testing purposes, but people started to use it, so I
> left it there, made it editable, and added a commit button that commits
> the changes. You can see it as a hole in the bottom of the car: if you
> are stuck, you can still power the car with your feet.
>
> There are also some unexpected advantages over "crm configure edit", see
> the video.
>
> http://youtu.be/X75wzUTRmjU?hd=1

Nice. Sound is missing for me from 3:00 onwards. Just in case that was not
intentional...

Lars
Re: [Pacemaker] How should I configure the STONITH?
Hello,

On 12/21/2011 03:34 AM, Qiu Zhigang wrote:
> Hi,
>
>> -----Original Message-----
>> From: Andreas Kurz [mailto:andr...@hastexo.com]
>> Sent: Wednesday, December 21, 2011 6:53 AM
>> To: pacemaker@oss.clusterlabs.org
>> Subject: Re: [Pacemaker] How should I configure the STONITH?
>>
>> Hello,
>>
>> On 12/20/2011 05:01 AM, Qiu Zhigang wrote:
>>> Hi,
>>>
>>>> -----Original Message-----
>>>> From: Andreas Kurz [mailto:andr...@hastexo.com]
>>>> Sent: Monday, December 19, 2011 6:50 PM
>>>> To: pacemaker@oss.clusterlabs.org
>>>> Subject: Re: [Pacemaker] How should I configure the STONITH?
>>>>
>>>> Hello,
>>>>
>>>> On 12/19/2011 08:31 AM, Qiu Zhigang wrote:
>>>>> Hi all,
>>>>>
>>>>> I want to configure STONITH, but I couldn't find the following CLI
>>>>> commands described in the reference material (such as
>>>>> Pacemaker-1.1-Pacemaker_Explained-en-US.pdf):
>>>>>
>>>>> stonith -L
>>>>> stonith -t ibmhmc -n
>>>>> stonith -t apcmaster -h
>>>>>
>>>>> I only found the stonith_admin command, but when I execute it, a
>>>>> problem occurs:
>>>>>
>>>>> [root@h10_151 ~]# stonith_admin -L
>>>>> stonith_admin[14402]: 2011/12/19_15:29:20 info: crm_log_init_worker:
>>>>> Changed active directory to /var/lib/heartbeat/cores/root
>>>>> stonith_admin[14402]: 2011/12/19_15:29:20 notice: log_data_element:
>>>>> st_callback: st_notify_disconnect subt="st_notify_disconnect" />
>>>>>
>>>>> What is the reason for this problem?
>>>>>
>>>>> Moreover, I didn't find the stonith plugins. Which rpm package
>>>>> contains them? I use Red Hat 6; the pacemaker version is
>>>>> pacemaker-1.1.2-7.el6.x86_64.rpm.
>>>>
>>>> RHEL6 and derivatives ship the "fence-agents" package known from Red
>>>> Hat Cluster, not the stonith agents ... the agents are all named
>>>> "fence_*" and come with nice man pages. You can use them (nearly) like
>>>> the stonith agents that used to come with Pacemaker as a prerequisite.
>>>
>>> Thank you, but I'm unsure how to configure fence-agents in Pacemaker.
>>> Could I configure them like the following? If this isn't right, please
>>> point me to the right way, thank you.
>>>
>>> configure
>>> primitive st-apc stonith:fence_apc \
>>>     params ip="xx.xx.xx.xx" username="admin" password="password"
>>> clone fencing st-apc
>>> commit
>>
>> try something like this for the primitive:
>>
>> primitive st-apc stonith:fence_apc \
>>     params ipaddr="xx.xx.xx.xx" login="admin" passwd="password" \
>>     action="reboot" port="dummy" pcmk_host_list="node1 node2" \
>>     pcmk_host_map="node1:1,node2:2" pcmk_host_check="static-list"
>>
>> "man stonithd" and "man fence_apc" should also be helpful
>>
> Thank you, I'll try.
>
>>> Another question is about the quorum device: which component is the
>>> quorum device in Pacemaker, like qdisk in CMAN -- is it SBD?
>>
>> corosync or heartbeat CCM do this job for Pacemaker ... they deliver
>> node availability information so Pacemaker can calculate quorum based
>> on node count.
>>
> But how is split-brain handled with two nodes? In cman we could use
> qdisk to arbitrate the quorum node; what about corosync+pacemaker?

You have to configure stonith and ignore quorum ... if you really run into
a split-brain, one node will be faster in fencing the other node ... see
http://ourobengr.com/ha for some hints on how not to run into a reboot
cycle. You could use redundant rings for corosync communication to lower
the risk of split-brain and/or add e.g. an extra "quorum" node.

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now
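Putting Andreas's suggestion together with the original attempt, a complete crm session might look like the sketch below (untested; the IP address, credentials, node names, and the outlet numbers in pcmk_host_map are placeholders to be replaced):

crm configure
primitive st-apc stonith:fence_apc \
        params ipaddr="xx.xx.xx.xx" login="admin" passwd="password" \
        action="reboot" pcmk_host_list="node1 node2" \
        pcmk_host_map="node1:1,node2:2" pcmk_host_check="static-list"
clone fencing st-apc
property stonith-enabled="true"
commit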
Re: [Pacemaker] CMAN and Pacemaker
Hello,

On 12/24/2011 09:13 AM, Fil wrote:
> Hi everyone,
>
> Happy holidays!
>
> I need some help with adding CMAN to my current cluster config.
> Currently I have a two-node Corosync/Pacemaker (active/passive) cluster.
> It works as expected. Now I need to add a distributed filesystem to my
> setup. I would like to test GFS2. As far as I understand, I need to set
> up CMAN to manage dlm/gfs_controld, am I correct? I have followed the
> Clusters_from_Scratch document, but I am having issues starting
> pacemakerd once cman is up and running. Is it possible to use
> dlm/gfs_controld without cman, directly from pacemaker? How do I start
> pacemaker when CMAN is running, do I even need to, and if not, how do I
> manage my resources? Currently I am using:
>
> Fedora 16
> corosync-1.4.2-1.fc16.x86_64
> pacemaker-1.1.6-4.fc16.x86_64
> cman-3.1.7-1.fc16.x86_64

Only start the cman service -- not corosync -- and then start the
pacemaker service; that should be enough. What is the error you get when
starting pacemaker via its init script?

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now

> Thanks
> filip
>
> cluster.conf
> ------------
> [the XML content was stripped by the list archive]
>
> corosync.conf
> -------------
> compatibility: whitetank
>
> totem {
>         version: 2
>         secauth: off
>         threads: 0
>         rrp_mode: passive
>
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 192.168.10.0
>                 mcastaddr: 226.94.1.1
>                 mcastport: 5405
>         }
> }
>
> logging {
>         fileline: off
>         to_stderr: no
>         to_logfile: yes
>         to_syslog: yes
>         logfile: /var/log/cluster/corosync.log
>         debug: off
>         timestamp: on
> }
>
> amf {
>         mode: disabled
> }
>
> pacemaker conf
> --------------
> node server01 \
>         attributes standby="off"
> node server02 \
>         attributes standby="off"
> primitive scsi_reservation ocf:adriatic:sg_persist \
>         params sg_persist_resource="scsi_reservation0" \
>         devs="/dev/disk/by-path/ip-192.168.10.5:3260-iscsi-iqn.2004-04.com.qnap:ts-459proii:iscsi.test.cb4d16-lun-0" \
>         required_devs_nof="1" reservation_type="1" \
>         op start interval="0" timeout="30s" \
>         op stop interval="0" timeout="30s"
> primitive vm_test ocf:adriatic:VirtualDomain \
>         params config="/etc/libvirt/qemu/test.xml" hypervisor="qemu:///system" migration_transport="tcp" \
>         meta allow-migrate="true" is-managed="true" target-role="Stopped" \
>         op start interval="0" timeout="120s" \
>         op stop interval="0" timeout="120s" \
>         op migrate_from interval="0" timeout="120s" \
>         op migrate_to interval="0" timeout="120s" \
>         op monitor interval="10" timeout="30" depth="0" \
>         utilization cpu="1" hv_memory="1024"
> ms ms_scsi_reservation scsi_reservation \
>         meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" migration-threshold="1" allow-migrate="true" globally-unique="false" target-role="Stopped"
> location cli-prefer-vm_test vm_test \
>         rule $id="cli-prefer-rule-vm_test" inf: #uname eq server02
> colocation service_on_scsi_reservation inf: vm_test ms_scsi_reservation:Master
> order service_after_scsi_reservation inf: ms_scsi_reservation:promote vm_test:start
> property $id="cib-bootstrap-options" \
>         dc-version="1.1.6-4.fc16-89678d4947c5bd466e2f31acd58ea4e1edb854d5" \
>         cluster-infrastructure="openais" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         no-quorum-policy="ignore" \
>         default-resource-stickiness="0" \
>         last-lrm-refresh="1324069959"
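For reference, the startup sequence Andreas describes would look roughly like this on each node (a sketch; on a systemd-based Fedora 16 the systemctl equivalents of these commands apply):

# cman brings up corosync itself with the proper configuration,
# so the standalone corosync service must stay disabled
chkconfig corosync off
service cman start
service pacemaker start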
Re: [Pacemaker] How can a node join another cluster safely?
On 12/26/2011 01:17 PM, Mars gu wrote:
> Hi all,
>
> I have two clusters, cluster_A and cluster_B. A node named node_b was in
> cluster_B before, and now I want to move node_b to cluster_A. So I
> replaced node_b's config file (corosync.conf) with the config file of a
> cluster_A node. When I restarted corosync, node_b joined cluster_A
> successfully, but I noticed that the cib.xml in cluster_A changed to
> match cluster_B's, and the resources I had defined in cluster_A were
> gone.
>
> How can node_b join cluster_A safely?
>
> I found that when I execute this cli before the service is started,
> cluster_A's config file is not changed:
>
> cibadmin --modify --crm_xml ''
>
> Is this the right way to solve the problem?
> thanks.

Completely clean the "/var/lib/heartbeat/crm" directory on that node
before you put it into another cluster.

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now
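In practice that cleanup might look like the sketch below (illustrative only; the path is the heartbeat/pacemaker 1.0-era default and may differ on other builds):

# on node_b, with the cluster stack stopped
service corosync stop
rm -f /var/lib/heartbeat/crm/cib*    # remove the cached CIB and its signature
service corosync start               # node joins with an empty CIB and
                                     # receives cluster_A's configuration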
Re: [Pacemaker] Patch: use NFSv4 with RA nfsserver
On Tue, Dec 27, 2011 at 3:30 PM, Vogt Josef wrote:
>> Just a question here: I couldn't get it to work without setting the
>> gracetime -- which isn't set in the exportfs RA. Are you sure this
>> works as expected?
>
> Thanks, good input. I'd be happy to add that (as in,
> wait_for_gracetime_on_start or similar). However, can you do me a favor
> please? Take a look at the discussion archived at
> http://www.spinics.net/lists/linux-nfs/msg22670.html and let me know if
> nlm_grace_period (as mentioned in
> http://www.spinics.net/lists/linux-nfs/msg22737.html) made any
> difference?

Yes, that did the trick. /proc/sys/fs/nfs/nlm_grace_period is not needed
for NFSv4, so I just set it to a very small value.

The problem is this: when using both NFSv2/NFSv3 and NFSv4, in case of a
failover a v4 client could get a new lock on a file locked by a v2/v3
client before the v2/v3 client has a chance to reclaim its lock... That's
why we have to wait the whole 90 seconds (nlm_grace_period). The other two
values (leasetime and gracetime) need to be small (which means higher load
but faster failover). As far as I understand it, you would have to wait
just until the bigger of /proc/fs/nfsd/nfsv4leasetime and
/proc/fs/nfsd/nfsv4gracetime is reached.

Note: when you set these values manually, they are gone after you reboot
the machine. That's why I made sure these values are set properly in case
of a failover.

I guess it would not be so easy to use NFSv4 and NFSv2/NFSv3 at the same
time. On the other hand, setting all three values mentioned above to 10
should be safe.

I found this thread very helpful:
http://marc.info/?l=linux-nfs&m=131590701830261&w=2

Kind regards
Josef
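For illustration, applying the values discussed above could look like this (a sketch; as noted in the thread, these settings do not survive a reboot, so a start/failover script has to reapply them, and the NFSv4 values should be written before nfsd starts):

# shorten the NFSv4 lease and grace periods for faster failover
echo 10 > /proc/fs/nfsd/nfsv4leasetime
echo 10 > /proc/fs/nfsd/nfsv4gracetime

# NLM (NFSv2/v3 locking) grace period; not needed for pure NFSv4,
# so it can be set very small there as well
echo 10 > /proc/sys/fs/nfs/nlm_grace_period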