Re: [Pacemaker] Two node DRBD cluster will not automatically failover to the secondary
Hi Shravan,

Thank you very much for your reply. I know it was quite a while ago that I posted my question to the mailing list, but I've been working on other things and have only just had the chance to come back to this.

You say that I need to set up stonith resources along with setting stonith-enabled = true. I know how to change the stonith-enabled setting, but I have no clue how to go about setting up the appropriate stonith resources to prevent DRBD from getting into a split-brain situation. The documentation on the DRBD website about setting up a two-node cluster with Pacemaker doesn't tell you to enable stonith or to configure stonith resources. It does cover the resource-fencing options in /etc/drbd.conf, which I have configured:

    resource r0 {
        disk {
            fencing resource-only;
        }
        handlers {
            fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
            after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
    }

I've searched the internet high and low for example Pacemaker configs that show how to configure stonith resources for DRBD, but I can't find anything useful. This howto (http://www.howtoforge.com/installation-and-setup-guide-for-drbd-openais-pacemaker-xen-on-opensuse-11.1) spells out how to configure a cluster and even states: "STONITH is disabled in this configuration though it is highly recommended in any production environment to eliminate the risk of divergent data." But, infuriatingly, it doesn't tell you how.

Could you please give me some pointers or some helpful examples, or perhaps point me to someone or something that can give me a hand in this area?

Many thanks,
Tom
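As a concrete starting point for what is being asked here: in crm shell syntax, a pair of node-level fencing resources might look like the sketch below. It assumes the servers have IPMI-capable management boards and uses the external/ipmi plugin from cluster-glue; the IPMI addresses and credentials are placeholders, and the right plugin depends entirely on the hardware (stonith -L lists the available ones).

    # Sketch only: one fencing device per node, using external/ipmi.
    # ipaddr, userid and passwd below are placeholders.
    primitive stonith-mq001 stonith:external/ipmi \
        params hostname=mq001.back.live.cwwtf.local \
               ipaddr=192.168.100.1 userid=admin passwd=secret interface=lan \
        op monitor interval=60s
    primitive stonith-mq002 stonith:external/ipmi \
        params hostname=mq002.back.live.cwwtf.local \
               ipaddr=192.168.100.2 userid=admin passwd=secret interface=lan \
        op monitor interval=60s
    # Keep each fencing device away from the node it is meant to fence.
    location l-stonith-mq001 stonith-mq001 -inf: mq001.back.live.cwwtf.local
    location l-stonith-mq002 stonith-mq002 -inf: mq002.back.live.cwwtf.local
    property stonith-enabled=true

With working node-level fencing in place, the drbd.conf fencing policy can also be raised from resource-only to resource-and-stonith.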
On Thu, Dec 17, 2009 at 2:14 PM, Shravan Mishra shravan.mis...@gmail.com wrote:

Hi,

For stateful resources like DRBD you will have to set up stonith resources for them to function properly, or at all. stonith-enabled is true by default.

Sincerely,
Shravan

On Thu, Dec 17, 2009 at 6:29 AM, Tom Pride tom.pr...@gmail.com wrote:

Hi there,

I have set up a two-node DRBD cluster with Pacemaker using the instructions provided on the drbd.org website: http://www.drbd.org/users-guide-emb/ch-pacemaker.html

The cluster works perfectly and I can migrate the resources back and forth between the two nodes without a problem. However, if I try simulating a complete server failure of the master node by powering off the server, Pacemaker does not then automatically bring up the remaining node as the master. I need some help to find out what configuration changes I need to make in order for my cluster to fail over automatically.

The cluster is built on two Red Hat EL 5.3 servers running the following software versions:

    drbd-8.3.6-1
    pacemaker-1.0.5-4.1
    openais-0.80.5-15.1

Below I have listed drbd.conf, openais.conf and the output of crm configure show. If someone could take a look at these for me and provide any suggestions/modifications I would be most grateful.

Thanks,
Tom

/etc/drbd.conf:

    global {
        usage-count no;
    }
    common {
        protocol C;
    }
    resource r0 {
        disk {
            fencing resource-only;
        }
        handlers {
            fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
            after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        }
        syncer {
            rate 40M;
        }
        on mq001.back.live.cwwtf.local {
            device    /dev/drbd1;
            disk      /dev/cciss/c0d0p1;
            address   172.23.8.69:7789;
            meta-disk internal;
        }
        on mq002.back.live.cwwtf.local {
            device    /dev/drbd1;
            disk      /dev/cciss/c0d0p1;
            address   172.23.8.70:7789;
            meta-disk internal;
        }
    }

r...@mq001:~# cat /etc/ais/openais.conf

    totem {
        version: 2
        token: 3000
        token_retransmits_before_loss_const: 10
        join: 60
        consensus: 1500
        vsftype: none
        max_messages: 20
        clear_node_high_bit: yes
        secauth: on
        threads: 0
        rrp_mode: passive
        interface {
            ringnumber: 0
            bindnetaddr: 172.59.60.0
            mcastaddr: 239.94.1.1
            mcastport: 5405
        }
        interface {
            ringnumber: 1
            bindnetaddr: 172.23.8.0
            mcastaddr: 239.94.2.1
            mcastport: 5405
        }
    }
    logging {
        to_stderr: yes
        debug: on
        timestamp: on
        to_file: no
        to_syslog: yes
        syslog_facility: daemon
    }
    amf {
        mode: disabled
    }
    service {
        ver: 0
        name: pacemaker
        use_mgmtd: yes
    }
    aisexec {
        user: root
        group: root
    }

r...@mq001:~# crm configure show

    node mq001.back.live.cwwtf.local
    node mq002.back.live.cwwtf.local
    primitive activemq-emp lsb:bbc-activemq-emp
    primitive activemq-forge-services lsb:bbc-activemq-forge-services
    primitive activemq-social lsb:activemq-social
    primitive drbd_activemq ocf:linbit:drbd \
        params drbd_resource=r0 \
        op monitor interval=15s
    primitive fs_activemq ocf:heartbeat:Filesystem \
        params device=/dev/drbd1 directory=/drbd fstype=ext3
    primitive ip_activemq ocf:heartbeat:IPaddr2 \
        params ip=172.23.8.71 nic=eth0
    group activemq fs_activemq ip_activemq activemq-forge-services activemq-emp activemq-social
    ms ms_drbd_activemq drbd_activemq \
        meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
    colocation activemq_on_drbd inf: activemq ms_drbd_activemq:Master
    order activemq_after_drbd inf: ms_drbd_activemq:promote activemq:start
    property $id=cib-bootstrap-options \
        dc-version=1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7 \
        cluster-infrastructure=openais \
        expected-quorum-votes=2 \
        no-quorum-policy=ignore \
        last-lrm-refresh=1260809203
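Worth knowing when reading the config above: with fencing resource-only, the crm-fence-peer.sh handler reacts to a broken replication link by writing a location constraint into the CIB that pins the DRBD Master role to the node that still has up-to-date data. It is roughly of this shape (the resource name follows the ms definition above; the exact constraint id varies with the script version):

    location drbd-fence-by-handler-ms_drbd_activemq ms_drbd_activemq \
        rule $role="Master" -inf: #uname ne mq001.back.live.cwwtf.local

The after-resync-target handler (crm-unfence-peer.sh) removes the constraint again once the peer has resynced. A stale constraint of this kind left in the CIB will itself block promotion on the surviving node, so it is one of the first things to look for when failover does not happen.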
Re: [Pacemaker] Two node DRBD cluster will not automatically failover to the secondary
Hi,

For stateful resources like DRBD you will have to set up stonith resources for them to function properly, or at all. stonith-enabled is true by default.

Sincerely,
Shravan

On Thu, Dec 17, 2009 at 6:29 AM, Tom Pride tom.pr...@gmail.com wrote: [...]
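For anyone wanting to verify this setting on a live cluster, either of the following should show it; note that both report only an explicitly set value, so no output means the built-in default is in effect:

    crm configure show | grep stonith-enabled
    crm_attribute -G -t crm_config -n stonith-enabled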
Re: [Pacemaker] Two node DRBD cluster will not automatically failover to the secondary
Tom Pride wrote:

Hi there,

I have set up a two-node DRBD cluster with Pacemaker using the instructions provided on the drbd.org website: http://www.drbd.org/users-guide-emb/ch-pacemaker.html

The cluster works perfectly and I can migrate the resources back and forth between the two nodes without a problem. However, if I try simulating a complete server failure of the master node by powering off the server, Pacemaker does not then automatically bring up the remaining node as the master. I need some help to find out what configuration changes I need to make in order for my cluster to fail over automatically.

Your config looks OKAY at first glance. To test, try disabling the second interface in openais.conf and running it with only one link, and see if that changes the behavior. If no luck, log files?

--
: Adam Gandelman
: LINBIT | Your Way to High Availability
: Telephone: 503-573-1262 ext. 203
: Sales: 1-877-4-LINBIT / 1-877-454-6248
:
: 7959 SW Cirrus Dr.
: Beaverton, OR 97008
:
: http://www.linbit.com
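For reference, the single-link test Adam suggests amounts to collapsing the totem section of the openais.conf quoted in this thread down to one ring, roughly like this (a sketch; with only one interface, rrp_mode goes back to none):

    totem {
        version: 2
        token: 3000
        token_retransmits_before_loss_const: 10
        join: 60
        consensus: 1500
        vsftype: none
        max_messages: 20
        clear_node_high_bit: yes
        secauth: on
        threads: 0
        rrp_mode: none
        interface {
            ringnumber: 0
            bindnetaddr: 172.59.60.0
            mcastaddr: 239.94.1.1
            mcastport: 5405
        }
    }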