Hi.

We had a Pacemaker cluster running an OCFS2 filesystem over a DRBD device, and we completely lost one of the hosts. Now I need some help to recover the data on the remaining machine.

I was able to load the DRBD module by hand and bring up the devices using the drbdadm command line:

apolo:~ # modprobe drbd
apolo:~ # cat /proc/drbd
version: 8.3.9 (api:88/proto:86-95)
srcversion: A67EB2D25C5AFBFF3D8B788
apolo:~ # drbd-overview
  0:backup
  1:export
apolo:~ # drbdadm attach backup
apolo:~ # drbdadm attach export
apolo:~ # drbd-overview
  0:backup  StandAlone Secondary/Unknown UpToDate/DUnknown r-----
  1:export  StandAlone Secondary/Unknown UpToDate/DUnknown r-----
apolo:~ # drbdadm primary backup
apolo:~ # drbdadm primary export
apolo:~ # drbd-overview
  0:backup  StandAlone Primary/Unknown UpToDate/DUnknown r-----
  1:export  StandAlone Primary/Unknown UpToDate/DUnknown r-----

We have these resources and constraints configured:

primitive resDLM ocf:pacemaker:controld \
        op monitor interval="120s"
primitive resDRBD_0 ocf:linbit:drbd \
        params drbd_resource="backup" \
        operations $id="resDRBD_0-operations" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="100" \
        op monitor interval="20" role="Master" timeout="20" \
        op monitor interval="30" role="Slave" timeout="20"
primitive resDRBD_1 ocf:linbit:drbd \
        params drbd_resource="export" \
        operations $id="resDRBD_1-operations" \
        op start interval="0" timeout="240" \
        op stop interval="0" timeout="100" \
        op monitor interval="20" role="Master" timeout="20" \
        op monitor interval="30" role="Slave" timeout="20"
primitive resFS_BACKUP ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/backup" directory="/backup" fstype="ocfs2" options="rw,noatime" \
        op monitor interval="120s"
primitive resFS_EXPORT ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/export" directory="/export" fstype="ocfs2" options="rw,noatime" \
        op monitor interval="120s"
primitive resO2CB ocf:ocfs2:o2cb \
        op monitor interval="120s"
group DRBD_01 resDRBD_0 resDRBD_1
ms msDRBD_01 DRBD_01 \
        meta resource-stickines="100" notify="true" master-max="2" interleave="true" target-role="Started"
clone cloneDLM resDLM \
        meta globally-unique="false" interleave="true" target-role="Started"
clone cloneFS_BACKUP resFS_BACKUP \
        meta interleave="true" ordered="true" target-role="Started"
clone cloneFS_EXPORT resFS_EXPORT \
        meta interleave="true" ordered="true" target-role="Started"
clone cloneO2CB resO2CB \
        meta globally-unique="false" interleave="true" target-role="Started"
colocation colDLMDRBD inf: cloneDLM msDRBD_01:Master
colocation colFS_BACKUP-O2CB inf: cloneFS_BACKUP cloneO2CB
colocation colFS_EXPORT-O2CB inf: cloneFS_EXPORT cloneO2CB
colocation colO2CBDLM inf: cloneO2CB cloneDLM
order ordDLMO2CB 0: cloneDLM cloneO2CB
order ordDRBDDLM 0: msDRBD_01:promote cloneDLM:start
order ordO2CB-FS_BACKUP 0: cloneO2CB cloneFS_BACKUP
order ordO2CB-FS_EXPORT 0: cloneO2CB cloneFS_EXPORT

Because the DRBD devices were brought up by hand, Pacemaker does not recognize that they are up, so it does not start the DLM resource, and all the resources that depend on it stay stopped.

Is there any way I can work around this issue? Is it possible to bring up the OCFS2 resources in standalone mode?

Any help would be very welcome.

Best regards,
Carlos.

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org