Hello Andrew,

Thank you so much for your response. I did manage to get an active/active cluster working using cman+pacemaker. Everything works fine except for an occasional error from fenced and a kernel crash from ocfs2_controld.cman.

>> Not true. SLES/openSUSE has supported cman-free clusters and cluster
>> filesystems
>> for many years.

I believe I have most of the pieces needed to build a pcmk-only active/active cluster (i.e., pcmk + corosync/openais, standard dlm_controld, and ocfs2_controld.pcmk).

When attempting to start the cluster:

aisexec
/etc/init.d/pacemaker start

root      1189  0.3  1.3  62980  3400 ?      Ssl  17:56  0:06 corosync
root      1205  0.0  0.6  13824  1668 pts/0  S    17:57  0:00 pacemakerd
root      1209  0.0  0.9  11112  2484 ?      Ss   17:57  0:00  \_ /usr/lib/heartbeat/stonithd
999       1210  0.0  1.5  12180  4020 ?      Ss   17:57  0:00  \_ /usr/lib/heartbeat/cib
root      1211  0.0  0.7   5444  1812 ?      Ss   17:57  0:00  \_ /usr/lib/heartbeat/lrmd
999       1212  0.0  1.0  11444  2620 ?      Ss   17:57  0:00  \_ /usr/lib/heartbeat/attrd
999       1213  0.0  0.8   7428  2120 ?      Ss   17:57  0:00  \_ /usr/lib/heartbeat/pengine
999       1214  0.0  1.1  15612  2892 ?      Ss   17:57  0:00  \_ /usr/lib/heartbeat/crmd

# mount -t configfs none /sys/kernel/config

# dlm_controld -D
logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/dlm_controld.log
dlm_controld 3.1.7 started
cman_admin_init error 2
/sys/kernel/config/dlm/cluster/comms: opendir failed: 2
/sys/kernel/config/dlm/cluster/spaces: opendir failed: 2

# ocfs2_controld.pcmk -D
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_clm' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_evt' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_ckpt' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_amf_v2' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_msg' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_lck' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_tmr' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: No additional configuration supplied for: service
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: No additional configuration supplied for: quorum
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: No default for option: provider
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_cluster_type: Detected an active 'corosync' cluster
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: init_ais_connection_once: Connection to 'corosync': established
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: crm_new_peer: Node astdrbd2 now has id: 6
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: crm_new_peer: Node 6 is now known as astdrbd2
ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: crm_abort: send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: send_ais_text: Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: crm_abort: send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: send_ais_text: Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
1321832187 setup_stack@170: Cluster connection established. Local node id: 6
1321832187 setup_stack@174: Added Pacemaker as client 1 with fd -1

It is thanks to the help of the LHA community that I have been able to progress as far as I have, and I really do appreciate it. I wish I could disclose more about why we don't just use the binaries included in a distro for a pcmk-only active/active cluster; however, all I am entitled to say is that it is a requirement to achieve this using source built from scratch.

Kind Regards,
Nick.

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems