Hello Andrew,

Thank you so much for your response. I am using ocfs2-tools 1.6, and it only includes the pcmk and cman builds of ocfs2_controld:
ocfs2_controld.cman ocfs2_controld.pcmk ocfs2_hb_ctl

Which stack provides the standard ocfs2_controld?

Thanks for everything!

Nick.

If it's cman

On Sun, Nov 13, 2011 at 6:49 PM, Andrew Beekhof <and...@beekhof.net> wrote:
> On Sat, Nov 12, 2011 at 12:06 AM, Nick Khamis <sym...@gmail.com> wrote:
>> Hello Andrew,
>>
>> I do apologize for this, and really appreciate how far I have got into
>> this project thanks to everyone's help. Just as a quick summary:
>>
>> the patch that you suggested did in fact fix the following (ais.c:346):
>>
>> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: crm_abort:
>> send_ais_text: Triggered assert at ais.c:346 : dest != crm_msg_ais
>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: send_ais_text:
>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: crm_abort:
>> send_ais_text: Triggered assert at ais.c:346 : dest != crm_msg_ais
>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: send_ais_text:
>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>> 1320247939 setup_stack@170: Cluster connection established. Local node id: 1
>> 1320247939 setup_stack@174: Added Pacemaker as client 1 with fd -1
>>
>> The run-time error I am getting now is in (corosync.c:352):
>>
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 info: crm_new_peer: Node 1
>> is now known as astdrbd1
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
>> send_ais_text: Triggered assert at corosync.c:352 : dest !=
>> crm_msg_ais
>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
>> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
>> send_ais_text: Triggered assert at corosync.c:352 : dest !=
>> crm_msg_ais
>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
>> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
>> 1320352460 setup_stack@170: Cluster connection established. Local node id: 1
>> 1320352460 setup_stack@174: Added Pacemaker as client 1 with fd -1
>>
>> * The controld RA is using the standard dlm_controld, and this is now
>>   working.
>> * The o2cb RA is using ocfs2_controld.pcmk, and this is where I am
>>   running into the runtime error with corosync.c
>
> As I mentioned in the last email, you're not supposed to use
> ocfs2_controld.pcmk with cman.
> You must use the standard ocfs2_controld
>
>>
>>>
>>> IMO (and as Florian alluded to in another message), you'd probably save
>>> yourself a lot of trouble taking prebuilt packages from a distro where
>>> the pieces you need are known to work together.
>>
>>> Indeed.
>>
>> There is no resenting that! But I am so close. Actually, I do have things
>> working without the o2cb primitive, i.e., pcmk is starting the dual primary
>> drbd, cloned dlm, and mounting the cloned ocfs2 filesystem:
>>
>> root@astdrbd1:~# /etc/init.d/cman start
>> Starting cluster:
>> Checking if cluster has been disabled at boot... [ OK ]
>> Checking Network Manager... [ OK ]
>> Global setup... [ OK ]
>> Loading kernel modules... [ OK ]
>> Mounting configfs... [ OK ]
>> Starting cman... [ OK ]
>> Waiting for quorum... [ OK ]
>> Starting fenced... [ OK ]
>> Starting dlm_controld... [ OK ]
>> Unfencing self... [ OK ]
>> Joining fence domain... [ OK ]
>>
>> root@astdrbd1:~# /etc/init.d/pacemaker start
>> Starting Pacemaker Cluster Manager: touch: missing file operand
>> Try `touch --help' for more information.
>> [ OK ]
>>
>>
>> ============
>> Last updated: Fri Nov 11 07:36:11 2011
>> Last change: Fri Nov 11 07:33:06 2011 via crmd on astdrbd1
>> Stack: cman
>> Current DC: astdrbd1 - partition with quorum
>> Version: 1.1.6-2d8fad5
>> 2 Nodes configured, 2 expected votes
>> 7 Resources configured.
>> ============
>>
>> Online: [ astdrbd1 astdrbd2 ]
>>
>> astIP (ocf::heartbeat:IPaddr2): Started astdrbd1
>> Master/Slave Set: msASTDRBD [astDRBD]
>>     Masters: [ astdrbd2 astdrbd1 ]
>> Clone Set: astDLMClone [astDLM]
>>     Started: [ astdrbd2 astdrbd1 ]
>> Clone Set: astFilesystemClone [astFilesystem]
>>     Started: [ astdrbd2 astdrbd1 ]
>>
>>
>> Of course, o2cb is not pcmk cluster aware right now and needs to be
>> started manually.
>>
>> Vladislav, if you are getting this, I can test for the kernel bug that
>> slows down ocfs2, which you reported earlier. Is there any test you would
>> like me to perform?
>>
>>
>> Kind Regards,
>>
>> Nick.
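
For reference, the one rule stated in the quoted exchange (ocfs2_controld.pcmk is not meant to be used with cman) can be checked mechanically against the "Stack:" line of the status output. The following is only a rough sketch and not something from the thread: the function names `detect_stack` and `check_daemon` are made up for illustration, the status text is pasted in as a string rather than read from a live cluster, and the check knows about this single combination and nothing else.

```python
import re


def detect_stack(status_text):
    """Return the value of the 'Stack:' line in crm_mon-style text, or None."""
    match = re.search(r"^Stack:\s*(\S+)", status_text, re.MULTILINE)
    return match.group(1) if match else None


def check_daemon(stack, daemon):
    """Flag the single stack/daemon combination ruled out in this thread.

    Hypothetical helper: it encodes only the advice quoted above and says
    nothing about which daemon is the right one for a given stack.
    """
    if stack == "cman" and daemon == "ocfs2_controld.pcmk":
        return "mismatch: ocfs2_controld.pcmk should not be used with the cman stack"
    return "no conflict known from this thread"


# Excerpt of the status output shown in the quoted message.
status = """\
Last updated: Fri Nov 11 07:36:11 2011
Stack: cman
Current DC: astdrbd1 - partition with quorum
Version: 1.1.6-2d8fad5
"""

stack = detect_stack(status)
print(stack, "->", check_daemon(stack, "ocfs2_controld.pcmk"))
```

This does not answer which package ships the standard ocfs2_controld; it only makes the mismatch visible before the o2cb resource is started.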
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems