Yes Andres, I'm Dave! To answer your questions:

- /etc/default/corosync: YES (start at boot).
- /etc/corosync/corosync.conf: identical between nodes, looks OK/intact, and is dated 1 month ago.
- authkeys: md5sum identical between nodes, dated 1 month ago, perms 400 root.root.
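For anyone who wants to repeat the same checks, here is a minimal sketch in Python that prints an MD5 digest, permission bits, and numeric owner for each file passed on the command line. The `fingerprint` helper name is made up for illustration, and the paths are whatever you pass in. Run it on each node and compare the output by eye.

```python
import hashlib
import os
import stat
import sys


def fingerprint(path):
    """Return (md5 hex digest, octal permission string, uid, gid) for a file."""
    with open(path, "rb") as fh:
        digest = hashlib.md5(fh.read()).hexdigest()
    st = os.stat(path)
    # S_IMODE strips the file-type bits, leaving only the permission bits.
    mode = format(stat.S_IMODE(st.st_mode), "03o")
    return digest, mode, st.st_uid, st.st_gid


if __name__ == "__main__":
    # Hypothetical usage: python fingerprint.py /etc/corosync/corosync.conf
    for path in sys.argv[1:]:
        digest, mode, uid, gid = fingerprint(path)
        print(f"{digest}  {mode}  {uid}:{gid}  {path}")
```

Matching digests on both nodes, with mode 400 and uid:gid 0:0 on the key file, would correspond to the "identical, 400 root.root" result reported above.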
Both nodes report the same status in crm_mon, so there is no reason to think comms is the problem (e.g. bad auth or multicast).

Last few lines of the log from node2 as crmd died:

.......
Nov 17 10:55:26 node2 crmd: [22808]: info: crm_timer_popped: Wait Timer (I_NULL) just popped!
Nov 17 10:55:26 node2 crmd: [22808]: WARN: lrm_signon: can not initiate connection
Nov 17 10:55:26 node2 crmd: [22808]: ERROR: do_lrm_control: Failed to sign on to the LRM 30 (max) times
Nov 17 10:55:26 node2 crmd: [22808]: ERROR: do_log: FSA: Input I_ERROR from do_lrm_control() received in state S_STARTING
Nov 17 10:55:26 node2 crmd: [22808]: info: do_state_transition: State transition S_STARTING -> S_RECOVERY [ input=I_ERROR cause=C_FSA_INTERNAL origin=do_lrm_control ]
Nov 17 10:55:26 node2 crmd: [22808]: ERROR: do_recover: Action A_RECOVER (0000000001000000) not supported
Nov 17 10:55:26 node2 crmd: [22808]: ERROR: do_started: Start cancelled... S_RECOVERY
Nov 17 10:55:26 node2 crmd: [22808]: ERROR: do_log: FSA: Input I_TERMINATE from do_recover() received in state S_RECOVERY
Nov 17 10:55:26 node2 crmd: [22808]: info: do_state_transition: State transition S_RECOVERY -> S_TERMINATE [ input=I_TERMINATE cause=C_FSA_INTERNAL origin=do_recover ]
Nov 17 10:55:26 node2 crmd: [22808]: info: do_ha_control: Disconnected from OpenAIS
Nov 17 10:55:26 node2 crmd: [22808]: info: do_cib_control: Disconnecting CIB
Nov 17 10:55:26 node2 crmd: [22808]: info: crmd_cib_connection_destroy: Connection to the CIB terminated...
Nov 17 10:55:26 node2 crmd: [22808]: info: do_exit: Performing A_EXIT_0 - gracefully exiting the CRMd
Nov 17 10:55:26 node2 crmd: [22808]: ERROR: do_exit: Could not recover from internal error
Nov 17 10:55:26 node2 crmd: [22808]: info: free_mem: Dropping I_TERMINATE: [ state=S_TERMINATE cause=C_FSA_INTERNAL origin=do_stop ]
Nov 17 10:55:26 node2 cib: [20601]: WARN: send_ipc_message: IPC Channel to 22808 is not connected
Nov 17 10:55:26 node2 cib: [20601]: WARN: send_via_callback_channel: Delivery of reply to client 22808/3210be2d-0165-4c05-8c43-8945feea0692 failed
Nov 17 10:55:26 node2 crmd: [22808]: info: do_exit: [crmd] stopped (2)
Nov 17 10:55:26 node2 cib: [20601]: WARN: do_local_notify: A-Sync reply to crmd failed: reply failed
Nov 17 10:55:26 node2 corosync[20586]: [pcmk ] info: pcmk_ipc_exit: Client crmd (conn=0x1d98d00, async-conn=0x1d98d00) left
Nov 17 10:55:27 node2 corosync[20586]: [pcmk ] ERROR: pcmk_wait_dispatch: Child process crmd exited (pid=22808, rc=2)
Nov 17 10:55:27 node2 corosync[20586]: [pcmk ] pcmk_wait_dispatch: Call to wait4(crmd) failed: (10) No child processes
Nov 17 10:55:27 node2 corosync[20586]: [pcmk ] ERROR: pcmk_wait_dispatch: Child respawn count exceeded by crmd
Nov 17 10:55:27 node2 corosync[20586]: [pcmk ] info: update_member: Node node2 now has process list: 00000000000000000000000000011112 (69906)

-- 
do_lrm_control: Failed to sign on to the LRM after upgrade to Maverick
https://bugs.launchpad.net/bugs/676391
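In case it helps anyone triaging a similar log, here is a rough Python sketch that pulls out only the ERROR entries, so the failure chain (LRM sign-on fails, crmd goes to S_RECOVERY, then exits, then corosync stops respawning it) is easier to see. It assumes the "LEVEL: function: message" layout seen in the lines above. The `problems` helper and `ENTRY_RE` names are made up for illustration.

```python
import re

# Assumed layout, taken from the log above: "... LEVEL: function: message".
# This matches both the crmd/cib lines and the corosync "[pcmk ]" lines.
ENTRY_RE = re.compile(r"\b(?P<level>ERROR|WARN): (?P<func>\w+): (?P<msg>.*)$")


def problems(lines, levels=("ERROR",)):
    """Return (function, message) pairs for entries at the given levels."""
    found = []
    for line in lines:
        match = ENTRY_RE.search(line)
        if match and match.group("level") in levels:
            found.append((match.group("func"), match.group("msg").strip()))
    return found


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python problems.py < /var/log/syslog
    for func, msg in problems(sys.stdin):
        print(f"{func}: {msg}")
```

Passing `levels=("ERROR", "WARN")` also includes the warnings, such as the `lrm_signon` one that comes just before the first error.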