Hello Andrew,

Thank you so much for your response. I did manage to get an active/active
cluster working using cman+pacemaker. Everything works fine except for the
occasional error from fenced, and a kernel crash from ocfs2_controld.cman.

>> Not true. SLES/openSUSE has supported cman-free clusters and cluster
>> filesystems for many years.

I believe I have most of the pieces needed to build a pcmk-only active/active
cluster (i.e., pcmk + corosync/openais, the standard dlm_controld, and
ocfs2_controld.pcmk).

Here is what I see when attempting to start the cluster:

# aisexec
# /etc/init.d/pacemaker start

root      1189  0.3  1.3  62980  3400 ?        Ssl  17:56   0:06 corosync
root      1205  0.0  0.6  13824  1668 pts/0    S    17:57   0:00 pacemakerd
root      1209  0.0  0.9  11112  2484 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/stonithd
999       1210  0.0  1.5  12180  4020 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/cib
root      1211  0.0  0.7   5444  1812 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/lrmd
999       1212  0.0  1.0  11444  2620 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/attrd
999       1213  0.0  0.8   7428  2120 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/pengine
999       1214  0.0  1.1  15612  2892 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/crmd


# mount -t configfs none /sys/kernel/config
# dlm_controld -D
logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/dlm_controld.log
dlm_controld 3.1.7 started
cman_admin_init error 2
/sys/kernel/config/dlm/cluster/comms: opendir failed: 2
/sys/kernel/config/dlm/cluster/spaces: opendir failed: 2

# ocfs2_controld.pcmk -D

ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_clm' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_evt' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_ckpt' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_amf_v2' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_msg' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_lck' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_tmr' for option: name
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: No additional configuration supplied for: service
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: No additional configuration supplied for: quorum
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: No default for option: provider
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_cluster_type: Detected an active 'corosync' cluster
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: init_ais_connection_once: Connection to 'corosync': established
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: crm_new_peer: Node astdrbd2 now has id: 6
ocfs2_controld[1786]: 2011/11/20_18:36:27 info: crm_new_peer: Node 6 is now known as astdrbd2
ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: crm_abort: send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: send_ais_text: Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: crm_abort: send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: send_ais_text: Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
1321832187 setup_stack@170: Cluster connection established.  Local node id: 6
1321832187 setup_stack@174: Added Pacemaker as client 1 with fd -1

It is thanks to the help of the LHA community that I have been able to
progress as far as I have, and I really appreciate it. I wish I could disclose
more about why we don't just use the binaries included in a distro for a
pcmk-only active/active cluster; however, all I am entitled to say is that we
are required to achieve this using source built from scratch.

Kind Regards,

Nick.