Hi there, sorry for getting back to this issue so late, but I had to work on something else for the past few days.
I reverted both virtual machines again. Here is the exact sequence of commands I used to try to get the Pacemaker-integrated dual-primary setup working:

- apt-get install python-software-properties && \
  add-apt-repository ppa:ubuntu-ha/lucid-cluster && \
  apt-get update
- apt-get install pacemaker libdlm3-pacemaker ocfs2-tools drbd8-utils openais
- Rebooted.
- shred -n 1 -v /dev/mapper/sde1_crypt
- Created the following configuration file for the DRBD device (/etc/drbd.d/r2.res):

    resource r2 {
        handlers {
            pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
            pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
            local-io-error    "echo o > /proc/sysrq-trigger ; halt -f";
        }
        startup {
            degr-wfc-timeout 120;
            become-primary-on both;
        }
        disk {
            on-io-error detach;
        }
        net {
            cram-hmac-alg sha1;
            shared-secret "SECRET";
            data-integrity-alg sha1;
            allow-two-primaries;
            after-sb-0pri disconnect;
            after-sb-1pri consensus;
            after-sb-2pri disconnect;
            rr-conflict disconnect;
        }
        syncer {
            rate 60M;
        }
        on janus {
            device    /dev/drbd2;
            disk      /dev/mapper/sde1_crypt;
            address   10.10.1.2:7882;
            meta-disk internal;
        }
        on mimas {
            device    /dev/drbd2;
            disk      /dev/mapper/sde1_crypt;
            address   10.10.1.3:7882;
            meta-disk internal;
        }
    }

- drbdadm create-md r2

    md_offset 26836983808
    al_offset 26836951040
    bm_offset 26836131840

    Found some data
    ==> This might destroy existing data! <==

    Do you want to proceed?
    [need to type 'yes' to confirm] yes

    Writing meta data...
    initializing activity log
    NOT initialized bitmap
    New drbd meta data block successfully created.
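Two settings in r2.res above are what permit dual-primary operation at all: allow-two-primaries in the net section and become-primary-on both in startup. A minimal sanity check, sketched in shell (the excerpt string is inlined for illustration; on a real node you would grep the actual /etc/drbd.d/r2.res instead):

```shell
# Minimal excerpt of the r2.res file above, inlined as a sample string;
# on a real node you would grep /etc/drbd.d/r2.res itself.
cfg_excerpt='startup { become-primary-on both; }
net { allow-two-primaries; }'

# Both options must be present before a cluster filesystem can run on
# two primaries at once.
for opt in 'allow-two-primaries' 'become-primary-on both'; do
  case "$cfg_excerpt" in
    *"$opt"*) echo "$opt: present" ;;
    *)        echo "$opt: MISSING" ;;
  esac
done
```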
    success

- On both nodes:
    - drbdadm create-md r2
    - drbdadm attach r2
    - drbdadm syncer r2
- On the second node:
    - drbdadm -- --discard-my-data connect r2
- On the first node:
    - drbdadm -- --overwrite-data-of-peer primary r2
    - drbdadm connect r2
- dpkg-reconfigure ocfs2-tools
- update-rc.d o2cb disable
- Created the following CIB objects:

    primitive resDrbd2 ocf:linbit:drbd \
        params drbd_resource="r2" \
        operations $id="resDrbd2-operations" \
        op monitor interval="20s" role="Master" timeout="20s" \
        op monitor interval="30s" role="Slave" timeout="20s"
    ms msDrbd2 resDrbd2 \
        meta resource-stickiness="100" \
        master-max="2" master-node-max="1" \
        clone-max="2" clone-node-max="1" \
        notify="true" globally-unique="false"
    location locDrbd2AllowedNodes msDrbd2 rule 200: #uname eq node1 or #uname eq node2
    location locDrbd2Master msDrbd2 rule role=master inf: #uname eq node1
    primitive resDlm ocf:pacemaker:controld \
        op monitor interval="120s" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100"
    clone cloneDlm resDlm \
        meta globally-unique="false" interleave="true"
    colocation colDlm-on-msDrbd2Master inf: cloneDlm msDrbd2:Master
    order ordDlm-before-msDrbd2Master 0: msDrbd2:promote cloneDlm
    location locCloneDlmAllowedNodes cloneDlm rule 200: #uname eq node1 or #uname eq node2
    primitive resO2CB ocf:pacemaker:o2cb \
        op monitor interval="120s" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100"
    clone cloneO2CB resO2CB \
        meta globally-unique="false" interleave="true"
    colocation colO2CB-on-Dlm inf: cloneO2CB cloneDlm
    order ordO2CB-after-Dlm 0: cloneDlm cloneO2CB
    location locCloneO2CBAllowedNodes cloneO2CB rule 100: #uname eq node1 or #uname eq node2

- Rebooted both nodes.
- After the reboot the required Pacemaker services are up and running (output of crm_mon -1f):

    Master/Slave Set: msDrbd2
        Masters: [ node1 node2 ]
    Clone Set: cloneDlm
        Started: [ node1 node2 ]
        Stopped: [ resDlm:2 ]
    Clone Set: cloneO2CB
        Started: [ node1 node2 ]
        Stopped: [ resO2CB:2 ]
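Once msDrbd2 shows both nodes as Masters, /proc/drbd should agree with the cluster's view. A small sketch of that check (the status line below is a sample in the shape DRBD 8.3 prints; the exact field layout can differ between versions, so treat the patterns as assumptions):

```shell
# Status line as /proc/drbd shows it once both nodes are promoted and
# fully synced; inlined here as a sample. On a node you would use:
#   status_line=$(grep '^ 2:' /proc/drbd)
status_line=' 2: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----'

case "$status_line" in
  *'cs:Connected'*'ro:Primary/Primary'*'ds:UpToDate/UpToDate'*)
    echo "r2 ready for OCFS2: connected, dual-primary, in sync" ;;
  *)
    echo "r2 NOT ready: $status_line" ;;
esac
```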
- Created the filesystem using mkfs.ocfs2 -L r2 /dev/drbd2:

    mkfs.ocfs2 1.4.3
    Cluster stack: pcmk
    Cluster name: pacemaker
    NOTE: Selecting extended slot map for userspace cluster stack
    Filesystem label=r2
    Block size=4096 (bits=12)
    Cluster size=4096 (bits=12)
    Volume size=26836131840 (6551790 clusters) (6551790 blocks)
    204 cluster groups (tail covers 3822 clusters, rest cover 32256 clusters)
    Journal size=167723008
    Initial number of node slots: 8
    Creating bitmaps: done
    Initializing superblock: done
    Writing system files: done
    Writing superblock: done
    Writing backup superblock: 3 block(s)
    Formatting Journals: done
    Formatting slot map: done
    Writing lost+found: done
    mkfs.ocfs2 successful

- Created the following CIB objects for the filesystem:

    primitive resFs2 ocf:heartbeat:Filesystem \
        params device="/dev/drbd2" fstype="ocfs2" directory="/var/www" \
        op monitor interval="120s" \
        op start interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        meta target-role="Stopped"
    clone cloneFs2 resFs2 \
        meta globally-unique="false" interleave="true"
    colocation colFs2-on-CloneO2CB inf: cloneFs2 cloneO2CB
    order ordFs2-after-cloneO2CB inf: cloneO2CB cloneFs2
    location locCloneFs2AllowedNodes cloneFs2 rule 100: #uname eq node1 or #uname eq node2

- Started the filesystem with crm resource start cloneFs2. It comes up fine on the first node (crm_mon -1f):

    Clone Set: cloneFs2
        Started: [ node1 ]
        Stopped: [ resFs2:0 ]

  but fails on the second node with the following message in the system log:

    ocfs2_controld[3483]: Unable to open checkpoint "ocfs2:controld": Object does not exist

As requested, here is the content of /etc/corosync/service.d/:

    root@node1:[~] # la /etc/corosync/service.d/
    total 12
    drwxr-xr-x 2 root root 4096 2011-07-05 10:56 .
    drwxr-xr-x 4 root root 4096 2011-05-31 16:28 ..
    -rw-r--r-- 1 root root 59 2010-02-18 11:09 ckpt-service

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/799711

Title:
  o2cb[11796]: ERROR: ocfs2_controld.pcmk did not come up

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ocfs2-tools/+bug/799711/+subscriptions