[Pacemaker] Installation problems

Erich Weiler Sun, 07 Mar 2010 08:36:12 -0800

Hi Y'all,

I'm having some issues getting things running on a stock CentOS 5.4 install, and I was hoping someone could point me in the right direction...

Through the epel and clusterlabs repos that are referenced in the wiki, I installed:


corosync-1.2.0-1.el5
openais-1.1.0-1.el5
pacemaker-1.0.7-4.el5
(and all dependencies, via yum)

and it all installed fine, according to yum. I installed /etc/corosync/corosync.conf as follows:


-----
# Please read the corosync.conf.5 manual page
compatibility: whitetank

aisexec {
       user:   root
       group:  root
}

totem {
       version: 2

       # How long before declaring a token lost (ms)
       token:          5000

       # How many token retransmits before forming a new configuration
       token_retransmits_before_loss_const: 20

       # How long to wait for join messages in the membership protocol (ms)
       join:           1000

       # How long to wait for consensus to be achieved before starting
       # a new round of membership configuration (ms)
       consensus:      7500

       # Turn off the virtual synchrony filter
       vsftype:        none

       # Number of messages that may be sent by one processor on
       # receipt of the token
       max_messages:   20

       # Disable encryption
       secauth:        off

       # How many threads to use for encryption/decryption
       threads:        0

       # Limit generated nodeids to 31-bits (positive signed integers)
       clear_node_high_bit: yes

       # Optionally assign a fixed node id (integer)
       # nodeid:         1234
       interface {
               ringnumber: 0
               bindnetaddr: 10.1.0.255
               mcastaddr: 226.94.1.90
               mcastport: 4000
       }
}

logging {
       fileline: off
       to_stderr: yes
       to_logfile: yes
       to_syslog: yes
       logfile: /var/log/corosync.log
       debug: off
       timestamp: on
       logger_subsys {
               subsys: AMF
               debug: off
       }
}

amf {
       mode: disabled
}

service {
       # Load the Pacemaker Cluster Resource Manager
       name: pacemaker
       ver:  0
}
-----

Then I tried:

# /etc/init.d/corosync start
Starting Corosync Cluster Engine (corosync):               [  OK  ]

but then when I run crm_mon, it hangs here:

"Attempting connection to the cluster...."

and nothing happens.  A 'ps' shows corosync in a weird state:

[r...@server ~]# ps -afe | grep coro
root     12942     1  0 08:20 ?        00:00:00 corosync
root     12947 12942  0 08:20 ?        00:00:00 [corosync] <defunct>
root     12955 12858  0 08:20 pts/0    00:00:00 grep coro
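
The <defunct> child can also be picked out by process state rather than by name. A minimal sketch, assuming a procps-style ps that accepts -eo; the helper name list_zombies is hypothetical:

```shell
# list_zombies: read "PID PPID STAT COMMAND" rows on stdin and print the
# PID, parent PID and command of rows whose state starts with Z
# (zombie/defunct). Hypothetical helper name, not part of any package.
list_zombies() {
    awk '$3 ~ /^Z/ { print $1, $2, $4 }'
}

# Usage (assumes procps ps):
# ps -eo pid,ppid,stat,comm | list_zombies
```

The parent PID column shows which process has not yet reaped the child; in the listing above that would be the main corosync process.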

I also tried starting corosync via '/etc/init.d/openais start' after changing the line in the /etc/init.d/openais script:

export COROSYNC_DEFAULT_CONFIG_IFACE="openaisserviceenableexperimental:corosync_parser"

and it seems to start, but crm_mon still can't connect, I still get "Attempting connection to the cluster....", and corosync is still in a defunct state. Has anyone else had this problem? Are the rpms from epel/clusterlabs perhaps not jibing with each other in some way?


Here is a clip from /var/log/corosync.log:

Mar 07 08:20:04 corosync [MAIN  ] Corosync Cluster Engine ('1.2.0'): started and ready to provide service.
Mar 07 08:20:04 corosync [MAIN  ] Corosync built-in features: nss rdma
Mar 07 08:20:04 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Mar 07 08:20:04 corosync [TOTEM ] Initializing transport (UDP/IP).
Mar 07 08:20:04 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Mar 07 08:20:04 corosync [MAIN  ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Mar 07 08:20:04 corosync [TOTEM ] The network interface [10.1.1.84] is now up.
Mar 07 08:20:04 corosync [pcmk  ] info: process_ais_conf: Reading configure
Mar 07 08:20:04 corosync [pcmk  ] info: config_find_init: Local handle: 5650605097994944514 for logging
Mar 07 08:20:04 corosync [pcmk  ] info: config_find_next: Processing additional logging options...
Mar 07 08:20:04 corosync [pcmk  ] info: get_config_opt: Found 'off' for option: debug
Mar 07 08:20:04 corosync [pcmk  ] info: get_config_opt: Defaulting to 'off' for option: to_file
Mar 07 08:20:04 corosync [pcmk  ] info: get_config_opt: Defaulting to 'daemon' for option: syslog_facility
Mar 07 08:20:04 corosync [pcmk  ] info: config_find_init: Local handle: 2730409743423111171 for service
Mar 07 08:20:04 corosync [pcmk  ] info: config_find_next: Processing additional service options...
Mar 07 08:20:04 corosync [pcmk  ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
Mar 07 08:20:04 corosync [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_logd
Mar 07 08:20:04 corosync [pcmk  ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd
Mar 07 08:20:04 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
Mar 07 08:20:04 corosync [pcmk  ] Logging: Initialized pcmk_startup
Mar 07 08:20:04 corosync [pcmk  ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Mar 07 08:20:04 corosync [pcmk  ] ERROR: pcmk_startup: Child 12947 spawned to record non-fatal assertion failure line 544: pwentry != NULL
Mar 07 08:20:04 corosync [pcmk  ] ERROR: pcmk_startup: Cluster user hacluster does not exist
Mar 07 08:20:04 corosync [SERV  ] Service engine loaded: Pacemaker Cluster Manager 1.0.7
Mar 07 08:20:04 corosync [SERV  ] Service engine loaded: corosync extended virtual synchrony service
Mar 07 08:20:04 corosync [SERV  ] Service engine loaded: corosync configuration service
Mar 07 08:20:04 corosync [SERV  ] Service engine loaded: corosync cluster closed process group service v1.01
Mar 07 08:20:04 corosync [SERV  ] Service engine loaded: corosync cluster config database access v1.01
Mar 07 08:20:04 corosync [SERV  ] Service engine loaded: corosync profile loading service
Mar 07 08:20:04 corosync [SERV  ] Service engine loaded: corosync cluster quorum service v0.1
Mar 07 08:20:04 corosync [pcmk  ] notice: pcmk_peer_update: Transitional membership event on ring 44: memb=0, new=0, lost=0
Mar 07 08:20:04 corosync [pcmk  ] notice: pcmk_peer_update: Stable membership event on ring 44: memb=1, new=1, lost=0
Mar 07 08:20:04 corosync [pcmk  ] info: update_member: Creating entry for node 1409351946 born on 44
Mar 07 08:20:04 corosync [pcmk  ] info: update_member: Node 1409351946/unknown is now: member
Mar 07 08:20:04 corosync [pcmk  ] info: pcmk_peer_update: NEW: .pending. 1409351946
Mar 07 08:20:05 corosync [pcmk  ] info: pcmk_peer_update: MEMB: .pending. 1409351946
Mar 07 08:20:05 corosync [pcmk  ] info: pcmk_update_nodeid: Local nodeid: 1409351946
Mar 07 08:20:05 corosync [pcmk  ] info: update_member: Node (null) now has 1 quorum votes (was 0)
Mar 07 08:20:05 corosync [pcmk  ] info: send_member_notification: Sending membership update 44 to 0 children
Mar 07 08:20:05 corosync [pcmk  ] info: update_member: Node (null) now has process list: 00000000000000000000000000000002 (2)
Mar 07 08:20:05 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Mar 07 08:20:05 corosync [pcmk  ] info: update_member: 0xec71ac0 Node 1409351946 now known as (was: (null))
Mar 07 08:20:05 corosync [pcmk  ] info: send_member_notification: Sending membership update 44 to 0 children
Mar 07 08:20:05 corosync [MAIN  ] Completed service synchronization, ready to provide service.

Mar 07 08:22:59 corosync [SERV  ] Unloading all Corosync service engines.
Mar 07 08:22:59 corosync [pcmk  ] notice: pcmk_shutdown: Shuting down Pacemaker
Mar 07 08:22:59 corosync [pcmk  ] notice: pcmk_shutdown: crmd confirmed stopped
Mar 07 08:22:59 corosync [pcmk  ] notice: pcmk_shutdown: pengine confirmed stopped
Mar 07 08:22:59 corosync [pcmk  ] notice: pcmk_shutdown: attrd confirmed stopped
Mar 07 08:22:59 corosync [pcmk  ] notice: pcmk_shutdown: lrmd confirmed stopped
Mar 07 08:22:59 corosync [pcmk  ] notice: pcmk_shutdown: cib confirmed stopped
Mar 07 08:22:59 corosync [pcmk  ] notice: pcmk_shutdown: stonithd confirmed stopped
Mar 07 08:22:59 corosync [pcmk  ] notice: pcmk_shutdown: Shutdown complete
Mar 07 08:22:59 corosync [SERV  ] Service engine unloaded: Pacemaker Cluster Manager 1.0.7
Mar 07 08:22:59 corosync [SERV  ] Service engine unloaded: corosync extended virtual synchrony service
Mar 07 08:22:59 corosync [SERV  ] Service engine unloaded: corosync configuration service
Mar 07 08:22:59 corosync [SERV  ] Service engine unloaded: corosync cluster closed process group service v1.01
Mar 07 08:22:59 corosync [SERV  ] Service engine unloaded: corosync cluster config database access v1.01
Mar 07 08:22:59 corosync [SERV  ] Service engine unloaded: corosync profile loading service
Mar 07 08:22:59 corosync [SERV  ] Service engine unloaded: corosync cluster quorum service v0.1
Mar 07 08:22:59 corosync [MAIN  ] Corosync Cluster Engine exiting with status -1 at main.c:158.
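
To narrow a long clip like this down to the entries the Pacemaker plugin flags as errors, a filter along these lines may help. This is a minimal sketch, assuming the log path from the corosync.conf above and a system that provides getent; the helper name pcmk_errors is hypothetical:

```shell
# pcmk_errors: print only the ERROR entries logged by the Pacemaker
# plugin ([pcmk] tag) from a corosync log read on stdin.
# Hypothetical helper name, not part of corosync or pacemaker.
pcmk_errors() {
    grep -E '\[pcmk *\] ERROR:'
}

# Usage: scan the log file configured in corosync.conf
# pcmk_errors < /var/log/corosync.log

# If an entry names a missing account, check for it (assumes getent):
# getent passwd hacluster || echo "hacluster user not found"
```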


Any hints welcome!!

TIA,
erich

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
