[ClusterLabs] Antw: RES: Pacemaker and OCFS2 on stand alone mode
>>> "Carlos Xavier" schrieb am 09.07.2016 um 00:43 >>> in Nachricht
<00f201d1d96a$38b76980$aa263c80$@com.br>:

> Thank you very much to everyone who tried to help me.
>
>> "Carlos Xavier" writes:
>>
>> > 1467918891 Is dlm missing from kernel? No misc devices found.
>> > 1467918891 /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
>> > 1467918891 /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
>> > 1467918891 No /sys/kernel/config, is configfs loaded?
>> > 1467918891 shutdown
>>
>> Try following the above hints:
>>
>>   modprobe configfs
>>   modprobe dlm
>>   mount -t configfs configfs /sys/kernel/config
>
> I tried those tips; they helped me get further, but it wasn't enough to
> get OCFS2 started in stand-alone mode in order to recover the data.
>
>> and then start the control daemon again. But this is pretty much what
>> the controld resource should do anyway. The main question is why your
>> cluster does not do it by itself. If you give up after all, try this:
>> https://www.drbd.org/en/doc/users-guide-83/s-ocfs2-legacy
>
> I decided to do a whole install of another machine, to take the place of
> the burned one, just to recover the data.

One pitfall I had was this: the OCFS2 stack has to be up when you FORMAT
the OCFS2 filesystem; otherwise it wouldn't mount. (Maybe one could do
some post-tweaks, but I just wanted to tell you.)

> Once again, many thanks to you.
>
> Regards,
> Carlos

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
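For the archives: the "legacy" route behind the drbd.org link above boils down to running the classic o2cb stack instead of the Pacemaker-managed one. A minimal sketch, assuming ocfs2-tools is installed and using /dev/drbd0 and /backup as hypothetical device and mount point (neither is taken from the thread):

```shell
# Sketch only -- requires root and the ocf2-tools o2cb init script.
modprobe configfs
mount -t configfs configfs /sys/kernel/config

/etc/init.d/o2cb load      # loads the ocfs2 stack kernel modules
/etc/init.d/o2cb online    # brings the legacy o2cb cluster online

# A filesystem formatted for the Pacemaker (pcmk) cluster stack may
# refuse to mount under o2cb; tunefs.ocfs2 can switch the on-disk
# cluster stack -- check the man page before touching a damaged setup:
# tunefs.ocfs2 --update-cluster-stack /dev/drbd0

mount -t ocfs2 /dev/drbd0 /backup
```

This matches the pitfall above: the stack that was up at mkfs time is the stack the filesystem expects at mount time.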
Re: [ClusterLabs] Antw: RES: Pacemaker and OCFS2 on stand alone mode
08.07.2016 09:11, Ulrich Windl writes:
>>>> "Carlos Xavier" schrieb am 07.07.2016 um 18:57 in Nachricht
>>>> <00e901d1d870$ae418000$0ac48000$@com.br>:
>> Thank you for the fast reply.
>>
>>> have you configured the stonith and drbd stonith handler?
>>
>> Yes, they were configured.
>> The cluster was running fine for more than 4 years, until we lost one
>> host to a power supply failure.
>> Now I need to access the files on the host that is working.
>
> Hi,
>
> MHO: Have you ever tested the configuration? I wonder why the cluster
> did not do everything to continue.

Stonith most likely failed if the node experienced a complete power
failure. We were not shown the cluster state, so this is just a guess;
but normally the way to recover is to manually declare the node as down.
That only does it for Pacemaker, though; I do not know how to do the same
for DRBD (unless Pacemaker somehow forwards this information to it).
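Manually declaring the lost node as down, as suggested above, can be done with stock Pacemaker tooling. A sketch; "apolo2" is a hypothetical name for the burned-out host (the thread never names it -- the survivor's prompt is "apolo"), and this informs Pacemaker only, not DRBD:

```shell
# Run on the surviving node; substitute the real name of the dead host.

# crmsh variant: clear the lost node's state so Pacemaker treats it as
# cleanly down instead of waiting for fencing to succeed
crm node clearstate apolo2

# lower-level variant: tell the cluster the node has been fenced
# manually (only after verifying it really is powered off!)
stonith_admin --confirm apolo2
```

Either way, the point is to satisfy Pacemaker that the peer cannot come back on its own, so that the DLM/O2CB chain is allowed to start on the survivor.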
[ClusterLabs] Antw: RES: Pacemaker and OCFS2 on stand alone mode
>>> "Carlos Xavier" schrieb am 07.07.2016 um 18:57 >>> in Nachricht
<00e901d1d870$ae418000$0ac48000$@com.br>:

> Thank you for the fast reply.
>
>> have you configured the stonith and drbd stonith handler?
>
> Yes, they were configured.
> The cluster was running fine for more than 4 years, until we lost one
> host to a power supply failure.
> Now I need to access the files on the host that is working.

Hi,

MHO: Have you ever tested the configuration? I wonder why the cluster did
not do everything to continue.

Regards,
Ulrich

>> 2016-07-07 16:43 GMT+02:00 Carlos Xavier:
>> > Hi.
>> > We had a Pacemaker cluster running an OCFS2 filesystem over a DRBD
>> > device, and we completely lost one of the hosts.
>> > Now I need some help to recover the data on the remaining machine.
>> > I was able to load the DRBD module by hand and bring up the devices
>> > using the drbdadm command line:
>> >
>> > apolo:~ # modprobe drbd
>> > apolo:~ # cat /proc/drbd
>> > version: 8.3.9 (api:88/proto:86-95)
>> > srcversion: A67EB2D25C5AFBFF3D8B788
>> >
>> > apolo:~ # drbd-overview
>> >   0:backup
>> >   1:export
>> > apolo:~ # drbdadm attach backup
>> > apolo:~ # drbdadm attach export
>> > apolo:~ # drbd-overview
>> >   0:backup  StandAlone Secondary/Unknown UpToDate/DUnknown r-
>> >   1:export  StandAlone Secondary/Unknown UpToDate/DUnknown r-
>> > apolo:~ # drbdadm primary backup
>> > apolo:~ # drbdadm primary export
>> > apolo:~ # drbd-overview
>> >   0:backup  StandAlone Primary/Unknown UpToDate/DUnknown r-
>> >   1:export  StandAlone Primary/Unknown UpToDate/DUnknown r-
>> >
>> > We have these resources and constraints configured:
>> >
>> > primitive resDLM ocf:pacemaker:controld \
>> >         op monitor interval="120s"
>> > primitive resDRBD_0 ocf:linbit:drbd \
>> >         params drbd_resource="backup" \
>> >         operations $id="resDRBD_0-operations" \
>> >         op start interval="0" timeout="240" \
>> >         op stop interval="0" timeout="100" \
>> >         op monitor interval="20" role="Master" timeout="20" \
>> >         op monitor interval="30" role="Slave" timeout="20"
>> > primitive resDRBD_1 ocf:linbit:drbd \
>> >         params drbd_resource="export" \
>> >         operations $id="resDRBD_1-operations" \
>> >         op start interval="0" timeout="240" \
>> >         op stop interval="0" timeout="100" \
>> >         op monitor interval="20" role="Master" timeout="20" \
>> >         op monitor interval="30" role="Slave" timeout="20"
>> > primitive resFS_BACKUP ocf:heartbeat:Filesystem \
>> >         params device="/dev/drbd/by-res/backup" directory="/backup" \
>> >                fstype="ocfs2" options="rw,noatime" \
>> >         op monitor interval="120s"
>> > primitive resFS_EXPORT ocf:heartbeat:Filesystem \
>> >         params device="/dev/drbd/by-res/export" directory="/export" \
>> >                fstype="ocfs2" options="rw,noatime" \
>> >         op monitor interval="120s"
>> > primitive resO2CB ocf:ocfs2:o2cb \
>> >         op monitor interval="120s"
>> > group DRBD_01 resDRBD_0 resDRBD_1
>> > ms msDRBD_01 DRBD_01 \
>> >         meta resource-stickines="100" notify="true" master-max="2" \
>> >              interleave="true" target-role="Started"
>> > clone cloneDLM resDLM \
>> >         meta globally-unique="false" interleave="true" \
>> >              target-role="Started"
>> > clone cloneFS_BACKUP resFS_BACKUP \
>> >         meta interleave="true" ordered="true" target-role="Started"
>> > clone cloneFS_EXPORT resFS_EXPORT \
>> >         meta interleave="true" ordered="true" target-role="Started"
>> > clone cloneO2CB resO2CB \
>> >         meta globally-unique="false" interleave="true" \
>> >              target-role="Started"
>> > colocation colDLMDRBD inf: cloneDLM msDRBD_01:Master
>> > colocation colFS_BACKUP-O2CB inf: cloneFS_BACKUP cloneO2CB
>> > colocation colFS_EXPORT-O2CB inf: cloneFS_EXPORT cloneO2CB
>> > colocation colO2CBDLM inf: cloneO2CB cloneDLM
>> > order ordDLMO2CB 0: cloneDLM cloneO2CB
>> > order ordDRBDDLM 0: msDRBD_01:promote cloneDLM:start
>> > order ordO2CB-FS_BACKUP 0: cloneO2CB cloneFS_BACKUP
>> > order ordO2CB-FS_EXPORT 0: cloneO2CB cloneFS_EXPORT
>> >
>> > As the DRBD devices were brought up by hand, Pacemaker doesn't
>> > recognize they are up, and so it doesn't start the DLM resource, and
>> > all resources that depend on it stay stopped.
>> > Is there any way I can circumvent this issue?
>> > Is it possible to bring the OCFS2 resources up in stand-alone mode?
>> > Please, any help will be very welcome.
>> >
>> > Best regards,
>> > Carlos.
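On Carlos's question of making Pacemaker pick up the hand-started devices: one recovery route (a sketch, not from the thread; it assumes crmsh, which the posted configuration syntax suggests) is to hand the DRBD devices back to their stopped state and let the cluster re-probe, using the resource names from the configuration above:

```shell
# Run on the surviving node, with the OCFS2 filesystems NOT mounted.

# Give the hand-started devices back, so Pacemaker owns them again:
drbdadm secondary backup && drbdadm down backup
drbdadm secondary export && drbdadm down export

# Clear stale/failed state and force a re-probe of current status:
crm resource cleanup msDRBD_01
crm resource reprobe
```

After the reprobe, Pacemaker should promote the DRBD master itself, which lets cloneDLM, cloneO2CB and the filesystem clones follow via the order and colocation constraints, instead of seeing resources it never started.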