On 05/12/2011 07:04 AM, Dan Frincu wrote:
> Hi,
>
> When using the same hostname on 2 nodes (debian squeeze, corosync
> 1.3.0-3 from unstable) the following happens:
>
> May 12 08:36:27 debian cib: [3125]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/84, version=0.5.1): ok (rc=0)
> May 12 08:36:27 debian crmd: [3129]: info: crm_get_peer: Node debian now has id: 620757002
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: All 1 cluster nodes responded to the join offer.
> May 12 08:36:27 debian crmd: [3129]: info: do_dc_join_finalize: join-29: Syncing the CIB from debian to the rest of the cluster
> May 12 08:36:27 debian crmd: [3129]: info: crm_get_peer: Node debian now has id: 603979786
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_INTEGRATION [ input=I_JOIN_REQUEST cause=C_HA_MESSAGE origin=route_message ]
> May 12 08:36:27 debian crmd: [3129]: info: update_dc: Unset DC debian
> May 12 08:36:27 debian cib: [3125]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/86, version=0.5.1): ok (rc=0)
> May 12 08:36:27 debian crmd: [3129]: info: do_dc_join_offer_all: join-30: Waiting on 1 outstanding join acks
> May 12 08:36:27 debian crmd: [3129]: info: update_dc: Set DC to debian (3.0.1)
> May 12 08:36:27 debian crmd: [3129]: info: crm_get_peer: Node debian now has id: 620757002
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: All 1 cluster nodes responded to the join offer.
> May 12 08:36:27 debian crmd: [3129]: info: do_dc_join_finalize: join-30: Syncing the CIB from debian to the rest of the cluster
> May 12 08:36:27 debian crmd: [3129]: info: crm_get_peer: Node debian now has id: 603979786
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_INTEGRATION [ input=I_JOIN_REQUEST cause=C_HA_MESSAGE origin=route_message ]
> May 12 08:36:27 debian crmd: [3129]: info: update_dc: Unset DC debian
> May 12 08:36:27 debian crmd: [3129]: info: do_dc_join_offer_all: join-31: Waiting on 1 outstanding join acks
> May 12 08:36:27 debian crmd: [3129]: info: update_dc: Set DC to debian (3.0.1)
> May 12 08:36:27 debian cib: [3125]: info: cib_process_request: Operation complete: op cib_sync for section 'all' (origin=local/crmd/88, version=0.5.1): ok (rc=0)
> May 12 08:36:27 debian crmd: [3129]: info: crm_get_peer: Node debian now has id: 620757002
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: State transition S_INTEGRATION -> S_FINALIZE_JOIN [ input=I_INTEGRATED cause=C_FSA_INTERNAL origin=check_join_state ]
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: All 1 cluster nodes responded to the join offer.
> May 12 08:36:27 debian crmd: [3129]: info: do_dc_join_finalize: join-31: Syncing the CIB from debian to the rest of the cluster
> May 12 08:36:27 debian crmd: [3129]: info: crm_get_peer: Node debian now has id: 603979786
> May 12 08:36:27 debian crmd: [3129]: info: do_state_transition: State transition S_FINALIZE_JOIN -> S_INTEGRATION [ input=I_JOIN_REQUEST cause=C_HA_MESSAGE origin=route_message ]
> May 12 08:36:27 debian crmd: [3129]: info: update_dc: Unset DC debian
> May 12 08:36:27 debian crmd: [3129]: info: do_dc_join_offer_all: join-32: Waiting on 1 outstanding join acks
> May 12 08:36:27 debian crmd: [3129]: info: update_dc: Set DC to debian (3.0.1)
>
> Basically it goes into an endless loop. This is an improperly configured
> option, but it would help the users if there was a handling of this or a
> relevant message printed in the logfile, such as "duplicate hostname found".
>

Dan, I believe this is a Pacemaker RFE. Corosync operates entirely on IP
addresses and never does any hostname-to-IP resolution (because the
resolver can block and cause bad things to happen).

> Regards.
> Dan
>
> --
> Dan Frincu
> CCNA, RHCE
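
Until such a message exists, the condition can be spotted from the log itself: the same node name keeps flipping between two ids. A rough Python sketch of that check follows. It is not part of Pacemaker or Corosync; the function names are made up for illustration, and it assumes each log entry sits on one line (unwrapped). It also assumes the ids were auto-generated from each node's IPv4 address and logged on a little-endian machine, in which case the two ids above would decode to 10.0.0.36 and 10.0.0.37. If nodeid was set by hand, the decoded addresses mean nothing.

```python
import re
import socket
import struct
from collections import defaultdict

# Matches crmd lines such as:
#   "... crm_get_peer: Node debian now has id: 620757002"
PEER_RE = re.compile(r"crm_get_peer: Node (\S+) now has id: (\d+)")


def nodeid_to_ipv4(nodeid):
    """Decode a node id into a dotted-quad address.

    ASSUMPTION: the id is the raw 32 bits of the node's IPv4 address
    read as a little-endian integer. This does not hold when nodeid
    is configured explicitly.
    """
    return socket.inet_ntoa(struct.pack("<I", nodeid))


def find_duplicate_hostnames(lines):
    """Return {name: sorted ids} for every node name seen with more than one id."""
    ids_by_name = defaultdict(set)
    for line in lines:
        match = PEER_RE.search(line)
        if match:
            ids_by_name[match.group(1)].add(int(match.group(2)))
    return {name: sorted(ids) for name, ids in ids_by_name.items() if len(ids) > 1}


if __name__ == "__main__":
    sample = [
        "May 12 08:36:27 debian crmd: [3129]: info: crm_get_peer: Node debian now has id: 620757002",
        "May 12 08:36:27 debian crmd: [3129]: info: update_dc: Unset DC debian",
        "May 12 08:36:27 debian crmd: [3129]: info: crm_get_peer: Node debian now has id: 603979786",
    ]
    for name, ids in find_duplicate_hostnames(sample).items():
        addrs = ", ".join(nodeid_to_ipv4(i) for i in ids)
        print(f"duplicate hostname found: {name} (ids {ids}, possibly {addrs})")
```

Feeding it the crmd lines quoted above flags "debian" with both ids, which is roughly the warning the original report asks for.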