On Tue, 25 Jun 2013 13:34:30 -0400 (EDT) Jake Smith <jsm...@argotec.com> wrote:
> I'm guessing if you run: > grep user /sys/fs/ocfs2/loaded_cluster_plugins 2>&1; rc=$? > your going to return a 1 or something other than 0. Hi Jake, true, as long as ocfs2_stack_user isn't loaded. I loaded it and now the command returns 0. But now I'm not able to mount the cluster-drive anymore. Corosync fails to start o2cb/ocfs2 and if I start it by hand the stack mismatches. Below you will find the corosync logs: Jun 26 10:32:13 test4 corosync[3101]: [MAIN ] Corosync Cluster Engine ('1.4.2'): started and ready to provide service. Jun 26 10:32:13 test4 corosync[3101]: [MAIN ] Corosync built-in features: nss Jun 26 10:32:13 test4 corosync[3101]: [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'. Jun 26 10:32:13 test4 corosync[3101]: [TOTEM ] Initializing transport (UDP/IP Multicast). Jun 26 10:32:13 test4 corosync[3101]: [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0). Jun 26 10:32:13 test4 corosync[3101]: [TOTEM ] The network interface [10.0.2.18] is now up. Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: process_ais_conf: Reading configure Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: config_find_init: Local handle: 5650605097994944515 for logging Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: config_find_next: Processing additional logging options... Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: Found 'off' for option: debug Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: Found 'no' for option: to_logfile Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: Found 'yes' for option: to_syslog Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: Found 'daemon' for option: syslog_facility Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: config_find_init: Local handle: 2730409743423111172 for quorum Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: config_find_next: No additional configuration supplied for: quorum Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: No default for option: provider Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: config_find_init: Local handle: 5880381755227111429 for service Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: config_find_next: Processing additional service options... Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: Found '0' for option: ver Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: Defaulting to 'no' for option: use_logd Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: get_config_opt: Defaulting to 'no' for option: use_mgmtd Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: pcmk_startup: CRM: Initialized Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] Logging: Initialized pcmk_startup Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615 Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: pcmk_startup: Service: 9 Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: pcmk_startup: Local hostname: test4-node1 Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: pcmk_update_nodeid: Local node id: 302120970 Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: update_member: Creating entry for node 302120970 born on 0 Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: update_member: 0xe28c00 Node 302120970 now known as test4-node1 (was: (null)) Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: update_member: Node test4-node1 now has 1 quorum votes (was 0) Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: update_member: Node 302120970/test4-node1 is now: member Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: spawn_child: Forked child 3132 for process cib Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: spawn_child: Forked child 3133 for process stonith-ng Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: spawn_child: Forked child 3134 for process lrmd Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: spawn_child: Forked child 3135 for process attrd Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: spawn_child: Forked child 3136 for process pengine Jun 26 10:32:13 test4 corosync[3101]: [pcmk ] info: spawn_child: Forked child 3137 for process crmd Jun 26 10:32:13 test4 corosync[3101]: [SERV ] Service engine loaded: Pacemaker Cluster Manager 1.1.7 Jun 26 10:32:14 test4 pengine: [3136]: info: Invoked: /usr/lib/pacemaker/pengine Jun 26 10:32:14 test4 corosync[3101]: [SERV ] Service engine loaded: openais checkpoint service B.01.01 Jun 26 10:32:14 test4 attrd: [3135]: info: Invoked: /usr/lib/pacemaker/attrd Jun 26 10:32:14 test4 attrd: [3135]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin) Jun 26 10:32:14 test4 crmd: [3137]: info: Invoked: /usr/lib/pacemaker/crmd Jun 26 10:32:14 test4 cib: [3132]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster Jun 26 10:32:14 test4 crmd: [3137]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/hacluster Jun 26 10:32:14 test4 crmd: [3137]: notice: main: CRM Git Version: ee0730e13d124c3d58f00016c3376a1de5323cff Jun 26 10:32:14 test4 cib: [3132]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig) Jun 26 10:32:14 test4 corosync[3101]: [SERV ] Service engine loaded: corosync extended virtual synchrony service Jun 26 10:32:14 test4 corosync[3101]: [SERV ] Service engine loaded: corosync configuration service Jun 26 10:32:14 test4 corosync[3101]: [SERV ] Service engine loaded: corosync cluster closed process group service v1.01 Jun 26 10:32:14 test4 corosync[3101]: [SERV ] Service engine loaded: corosync cluster config database access v1.01 Jun 26 10:32:14 test4 stonith-ng: [3133]: info: Invoked: /usr/lib/pacemaker/stonithd Jun 26 10:32:14 test4 stonith-ng: [3133]: info: crm_log_init_worker: Changed active directory to /var/lib/heartbeat/cores/root Jun 26 10:32:14 test4 stonith-ng: [3133]: info: get_cluster_type: Cluster type is: 'openais' Jun 26 10:32:14 test4 stonith-ng: [3133]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin) Jun 26 10:32:14 test4 stonith-ng: [3133]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jun 26 10:32:14 test4 corosync[3101]: [SERV ] Service engine loaded: corosync profile loading service Jun 26 10:32:14 test4 corosync[3101]: [SERV ] Service engine loaded: corosync cluster quorum service v0.1 Jun 26 10:32:14 test4 corosync[3101]: [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine. Jun 26 10:32:14 test4 stonith-ng: [3133]: info: init_ais_connection_classic: AIS connection established Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: pcmk_ipc: Recorded connection 0xe32e10 for attrd/3135 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: pcmk_ipc: Recorded connection 0xe37170 for stonith-ng/3133 Jun 26 10:32:14 test4 attrd: [3135]: notice: main: Starting mainloop... Jun 26 10:32:14 test4 stonith-ng: [3133]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk Jun 26 10:32:14 test4 stonith-ng: [3133]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jun 26 10:32:14 test4 stonith-ng: [3133]: info: crm_new_peer: Node test4-node1 now has id: 302120970 Jun 26 10:32:14 test4 stonith-ng: [3133]: info: crm_new_peer: Node 302120970 is now known as test4-node1 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 56: memb=0, new=0, lost=0 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 56: memb=3, new=3, lost=0 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: pcmk_peer_update: NEW: test4-node1 302120970 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Creating entry for node 302186506 born on 56 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Node 302186506/unknown is now: member Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 302186506 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Creating entry for node 302252042 born on 56 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Node 302252042/unknown is now: member Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: pcmk_peer_update: NEW: .pending. 302252042 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: pcmk_peer_update: MEMB: test4-node1 302120970 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 302186506 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: pcmk_peer_update: MEMB: .pending. 302252042 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Node test4-node1 now has process list: 00000000000000000000000000111312 (1118994) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: send_member_notification: Sending membership update 56 to 0 children Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: 0xe28c00 Node 302120970 ((null)) born on: 56 Jun 26 10:32:14 test4 corosync[3101]: [TOTEM ] A processor joined or left the membership and a new membership was formed. Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: 0xe3ddb0 Node 302186506 (test4-node2) born on: 52 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: 0xe3ddb0 Node 302186506 now known as test4-node2 (was: (null)) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Node test4-node2 now has process list: 00000000000000000000000000111312 (1118994) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Node test4-node2 now has 1 quorum votes (was 0) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: send_member_notification: Sending membership update 56 to 0 children Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: 0xe3dfe0 Node 302252042 (test4) born on: 8 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: 0xe3dfe0 Node 302252042 now known as test4 (was: (null)) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Node test4 now has process list: 00000000000000000000000000111312 (1118994) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_member: Node test4 now has 1 quorum votes (was 0) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: update_expected_votes: Expected quorum votes 2 -> 3 Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] info: send_member_notification: Sending membership update 56 to 0 children Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 cib: [3132]: info: validate_with_relaxng: Creating RNG parser context Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [pcmk ] WARN: route_ais_message: Sending message to local.cib failed: ipc delivery failed (rc=-2) Jun 26 10:32:14 test4 corosync[3101]: [CPG ] chosen downlist: sender r(0) ip(10.0.3.18) ; members(old:2 left:0) Jun 26 10:32:14 test4 corosync[3101]: [MAIN ] Completed service synchronization, ready to provide service. Jun 26 10:32:15 test4 lrmd: [3134]: info: enabling coredumps Jun 26 10:32:15 test4 lrmd: [3134]: WARN: Core dumps could be lost if multiple dumps occur. Jun 26 10:32:15 test4 lrmd: [3134]: WARN: Consider setting non-default value in /proc/sys/kernel/core_pattern (or equivalent) for maximum supportability Jun 26 10:32:15 test4 lrmd: [3134]: WARN: Consider setting /proc/sys/kernel/core_uses_pid (or equivalent) to 1 for maximum supportability Jun 26 10:32:15 test4 lrmd: [3134]: info: Started. Jun 26 10:32:15 test4 cib: [3132]: info: startCib: CIB Initialization completed successfully Jun 26 10:32:15 test4 cib: [3132]: info: get_cluster_type: Cluster type is: 'openais' Jun 26 10:32:15 test4 cib: [3132]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin) Jun 26 10:32:15 test4 cib: [3132]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jun 26 10:32:15 test4 cib: [3132]: info: init_ais_connection_classic: AIS connection established Jun 26 10:32:15 test4 corosync[3101]: [pcmk ] info: pcmk_ipc: Recorded connection 0xe3e670 for cib/3132 Jun 26 10:32:15 test4 corosync[3101]: [pcmk ] info: pcmk_ipc: Sending membership update 56 to cib Jun 26 10:32:15 test4 cib: [3132]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk Jun 26 10:32:15 test4 cib: [3132]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node test4-node1 now has id: 302120970 Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node 302120970 is now known as test4-node1 Jun 26 10:32:15 test4 cib: [3132]: info: cib_init: Starting cib mainloop Jun 26 10:32:15 test4 cib: [3132]: notice: ais_dispatch_message: Membership 56: quorum acquired Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node test4 now has id: 302252042 Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node 302252042 is now known as test4 Jun 26 10:32:15 test4 cib: [3132]: info: crm_update_peer: Node test4: id=302252042 state=member (new) addr=r(0) ip(10.0.4.18) votes=1 born=8 seen=56 proc=00000000000000000000000000111312 Jun 26 10:32:15 test4 cib: [3132]: info: crm_update_peer: Node test4-node1: id=302120970 state=member (new) addr=r(0) ip(10.0.2.18) (new) votes=1 (new) born=56 seen=56 proc=00000000000000000000000000111312 (new) Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node test4-node2 now has id: 302186506 Jun 26 10:32:15 test4 cib: [3132]: info: crm_new_peer: Node 302186506 is now known as test4-node2 Jun 26 10:32:15 test4 cib: [3132]: info: crm_update_peer: Node test4-node2: id=302186506 state=member (new) addr=r(0) ip(10.0.3.18) votes=1 born=52 seen=56 proc=00000000000000000000000000111312 Jun 26 10:32:15 test4 crmd: [3137]: info: do_cib_control: CIB connection established Jun 26 10:32:15 test4 crmd: [3137]: info: get_cluster_type: Cluster type is: 'openais' Jun 26 10:32:15 test4 crmd: [3137]: notice: crm_cluster_connect: Connecting to cluster infrastructure: classic openais (with plugin) Jun 26 10:32:15 test4 crmd: [3137]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jun 26 10:32:15 test4 crmd: [3137]: info: init_ais_connection_classic: AIS connection established Jun 26 10:32:15 test4 corosync[3101]: [pcmk ] info: pcmk_ipc: Recorded connection 0xe43070 for crmd/3137 Jun 26 10:32:15 test4 corosync[3101]: [pcmk ] info: pcmk_ipc: Sending membership update 56 to crmd Jun 26 10:32:15 test4 crmd: [3137]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk Jun 26 10:32:15 test4 crmd: [3137]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node test4-node1 now has id: 302120970 Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node 302120970 is now known as test4-node1 Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4-node1 is now unknown Jun 26 10:32:15 test4 crmd: [3137]: info: do_ha_control: Connected to the cluster Jun 26 10:32:15 test4 crmd: [3137]: info: do_started: Delaying start, no membership data (0000000000100000) Jun 26 10:32:15 test4 crmd: [3137]: notice: ais_dispatch_message: Membership 56: quorum acquired Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node test4 now has id: 302252042 Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node 302252042 is now known as test4 Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4 is now unknown Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4 is now member (was unknown) Jun 26 10:32:15 test4 crmd: [3137]: info: crm_update_peer: Node test4: id=302252042 state=member (new) addr=r(0) ip(10.0.4.18) votes=1 born=8 seen=56 proc=00000000000000000000000000111312 Jun 26 10:32:15 test4 crmd: [3137]: notice: crmd_peer_update: Status update: Client test4-node1/crmd now has status [online] (DC=<null>) Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4-node1 is now member (was unknown) Jun 26 10:32:15 test4 crmd: [3137]: info: crm_update_peer: Node test4-node1: id=302120970 state=member (new) addr=r(0) ip(10.0.2.18) (new) votes=1 (new) born=56 seen=56 proc=00000000000000000000000000111312 (new) Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node test4-node2 now has id: 302186506 Jun 26 10:32:15 test4 crmd: [3137]: info: crm_new_peer: Node 302186506 is now known as test4-node2 Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4-node2 is now unknown Jun 26 10:32:15 test4 crmd: [3137]: info: ais_status_callback: status: test4-node2 is now member (was unknown) Jun 26 10:32:15 test4 crmd: [3137]: info: crm_update_peer: Node test4-node2: id=302186506 state=member (new) addr=r(0) ip(10.0.3.18) votes=1 born=52 seen=56 proc=00000000000000000000000000111312 Jun 26 10:32:15 test4 crmd: [3137]: info: do_started: Delaying start, Config not read (0000000000000040) Jun 26 10:32:15 test4 crmd: [3137]: notice: do_started: The local CRM is operational Jun 26 10:32:15 test4 crmd: [3137]: notice: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ] Jun 26 10:32:15 test4 stonith-ng: [3133]: notice: setup_cib: Watching for stonith topology changes Jun 26 10:32:15 test4 stonith-ng: [3133]: info: main: Starting stonith-ng mainloop Jun 26 10:32:16 test4 crmd: [3137]: info: ais_dispatch_message: Membership 56: quorum retained Jun 26 10:32:17 test4 crmd: [3137]: info: update_dc: Set DC to test4 (3.0.6) Jun 26 10:32:18 test4 crmd: [3137]: info: erase_status_tag: Deleting xpath: //node_state[@uname='test4-node1']/transient_attributes Jun 26 10:32:18 test4 crmd: [3137]: info: update_attrd: Connecting to attrd... Jun 26 10:32:18 test4 crmd: [3137]: notice: do_state_transition: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ] Jun 26 10:32:18 test4 attrd: [3135]: notice: attrd_local_callback: Sending full refresh (origin=crmd) Jun 26 10:32:18 test4 cib: [3132]: info: cib_replace_notify: Replaced: -1.-1.-1 -> 0.104.145 from test4 Jun 26 10:32:18 test4 lrmd: [3134]: info: rsc:drbd:0 probe[2] (pid 3168) Jun 26 10:32:18 test4 lrmd: [3134]: info: rsc:pingtest:0 probe[3] (pid 3169) Jun 26 10:32:18 test4 lrmd: [3134]: info: rsc:sip probe[4] (pid 3170) Jun 26 10:32:18 test4 lrmd: [3134]: info: rsc:apache probe[5] (pid 3171) Jun 26 10:32:18 test4 lrmd: [3134]: info: operation monitor[3] on pingtest:0 for client 3137: pid 3169 exited with return code 7 Jun 26 10:32:18 test4 crmd: [3137]: info: process_lrm_event: LRM operation pingtest:0_monitor_0 (call=3, rc=7, cib-update=8, confirmed=true) not running Jun 26 10:32:19 test4 lrmd: [3134]: info: rsc:p_controld:0 probe[6] (pid 3294) Jun 26 10:32:19 test4 lrmd: [3134]: info: operation monitor[4] on sip for client 3137: pid 3170 exited with return code 7 Jun 26 10:32:19 test4 crmd: [3137]: info: process_lrm_event: LRM operation sip_monitor_0 (call=4, rc=7, cib-update=9, confirmed=true) not running Jun 26 10:32:19 test4 apache[3171]: INFO: apache not running Jun 26 10:32:19 test4 lrmd: [3134]: info: RA output: (p_controld:0:probe:stderr) dlm_controld.pcmk: no process found Jun 26 10:32:19 test4 lrmd: [3134]: info: operation monitor[6] on p_controld:0 for client 3137: pid 3294 exited with return code 7 Jun 26 10:32:19 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_controld:0_monitor_0 (call=6, rc=7, cib-update=10, confirmed=true) not running Jun 26 10:32:19 test4 lrmd: [3134]: info: operation monitor[5] on apache for client 3137: pid 3171 exited with return code 7 Jun 26 10:32:19 test4 crmd: [3137]: info: process_lrm_event: LRM operation apache_monitor_0 (call=5, rc=7, cib-update=11, confirmed=true) not running Jun 26 10:32:19 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:0 (10000) Jun 26 10:32:19 test4 lrmd: [3134]: info: operation monitor[2] on drbd:0 for client 3137: pid 3168 exited with return code 8 Jun 26 10:32:19 test4 crmd: [3137]: info: process_lrm_event: LRM operation drbd:0_monitor_0 (call=2, rc=8, cib-update=12, confirmed=true) master Jun 26 10:32:19 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: master-drbd:0 (10000) Jun 26 10:32:19 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 4: master-drbd:0=10000 Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:p_o2cb:0 probe[7] (pid 3330) Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:fs_drbd:0 probe[8] (pid 3331) Jun 26 10:32:20 test4 o2cb[3330]: INFO: configfs not mounted Jun 26 10:32:20 test4 lrmd: [3134]: info: operation monitor[7] on p_o2cb:0 for client 3137: pid 3330 exited with return code 7 Jun 26 10:32:20 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_o2cb:0_monitor_0 (call=7, rc=7, cib-update=13, confirmed=true) not running Jun 26 10:32:20 test4 lrmd: [3134]: info: operation monitor[8] on fs_drbd:0 for client 3137: pid 3331 exited with return code 7 Jun 26 10:32:20 test4 crmd: [3137]: info: process_lrm_event: LRM operation fs_drbd:0_monitor_0 (call=8, rc=7, cib-update=14, confirmed=true) not running Jun 26 10:32:20 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true) Jun 26 10:32:20 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 9: probe_complete=true Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:pingtest:0 start[9] (pid 3392) Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:p_controld:0 start[10] (pid 3393) Jun 26 10:32:20 test4 lrmd: [3134]: info: RA output: (p_controld:0:start:stderr) dlm_controld.pcmk: no process found Jun 26 10:32:20 test4 lrmd: [3134]: info: operation start[9] on pingtest:0 for client 3137: pid 3392 exited with return code 0 Jun 26 10:32:20 test4 crmd: [3137]: info: process_lrm_event: LRM operation pingtest:0_start_0 (call=9, rc=0, cib-update=15, confirmed=true) ok Jun 26 10:32:20 test4 lrmd: [3134]: info: rsc:pingtest:0 monitor[11] (pid 3428) Jun 26 10:32:20 test4 lrmd: [3134]: info: operation monitor[11] on pingtest:0 for client 3137: pid 3428 exited with return code 0 Jun 26 10:32:20 test4 crmd: [3137]: info: process_lrm_event: LRM operation pingtest:0_monitor_5000 (call=11, rc=0, cib-update=16, confirmed=false) ok Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: get_cluster_type: Cluster type is: 'openais' Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: init_ais_connection_classic: AIS connection established Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: crm_new_peer: Node test4-node1 now has id: 302120970 Jun 26 10:32:20 test4 cluster-dlm: [3452]: info: crm_new_peer: Node 302120970 is now known as test4-node1 Jun 26 10:32:20 test4 corosync[3101]: [pcmk ] info: pcmk_notify: Enabling node notifications for child 3452 (0xe47ef0) Jun 26 10:32:20 test4 kernel: [ 37.842184] dlm_controld.pc (3452): /proc/3452/oom_adj is deprecated, please use /proc/3452/oom_score_adj instead. Jun 26 10:32:21 test4 cluster-dlm: [3452]: notice: ais_dispatch_message: Membership 56: quorum acquired Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_new_peer: Node test4 now has id: 302252042 Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_new_peer: Node 302252042 is now known as test4 Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_update_peer: Node test4: id=302252042 state=member (new) addr=r(0) ip(10.0.4.18) votes=1 born=8 seen=56 proc=00000000000000000000000000111312 Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_update_peer: Node test4-node1: id=302120970 state=member (new) addr=r(0) ip(10.0.2.18) (new) votes=1 (new) born=56 seen=56 proc=00000000000000000000000000111312 (new) Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_new_peer: Node test4-node2 now has id: 302186506 Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_new_peer: Node 302186506 is now known as test4-node2 Jun 26 10:32:21 test4 cluster-dlm: [3452]: info: crm_update_peer: Node test4-node2: id=302186506 state=member (new) addr=r(0) ip(10.0.3.18) votes=1 born=52 seen=56 proc=00000000000000000000000000111312 Jun 26 10:32:22 test4 lrmd: [3134]: info: operation start[10] on p_controld:0 for client 3137: pid 3393 exited with return code 0 Jun 26 10:32:22 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_controld:0_start_0 (call=10, rc=0, cib-update=17, confirmed=true) ok Jun 26 10:32:22 test4 lrmd: [3134]: info: rsc:p_o2cb:0 start[12] (pid 3460) Jun 26 10:32:22 test4 o2cb[3460]: INFO: ocfs2 not loaded Jun 26 10:32:22 test4 o2cb[3460]: INFO: Starting p_o2cb:0 Jun 26 10:32:22 test4 kernel: [ 39.321729] OCFS2 Node Manager 1.5.0 Jun 26 10:32:22 test4 kernel: [ 39.363557] OCFS2 1.5.0 Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: get_cluster_type: Cluster type is: 'openais' Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: init_ais_connection_classic: Creating connection to our Corosync plugin Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: init_ais_connection_classic: AIS connection established Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: get_ais_nodeid: Server details: id=302120970 uname=test4-node1 cname=pcmk Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: init_ais_connection_once: Connection to 'classic openais (with plugin)': established Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node test4-node1 now has id: 302120970 Jun 26 10:32:22 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node 302120970 is now known as test4-node1 Jun 26 10:32:22 test4 corosync[3101]: [pcmk ] info: pcmk_notify: Enabling node notifications for child 3495 (0xe56a30) Jun 26 10:32:23 test4 ocfs2_controld: [3495]: notice: ais_dispatch_message: Membership 56: quorum acquired Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node test4 now has id: 302252042 Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node 302252042 is now known as test4 Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_update_peer: Node test4: id=302252042 state=member (new) addr=r(0) ip(10.0.4.18) votes=1 born=8 seen=56 proc=00000000000000000000000000111312 Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_update_peer: Node test4-node1: id=302120970 state=member (new) addr=r(0) ip(10.0.2.18) (new) votes=1 (new) born=56 seen=56 proc=00000000000000000000000000111312 (new) Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node test4-node2 now has id: 302186506 Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_new_peer: Node 302186506 is now known as test4-node2 Jun 26 10:32:23 test4 ocfs2_controld: [3495]: info: crm_update_peer: Node test4-node2: id=302186506 state=member (new) addr=r(0) ip(10.0.3.18) votes=1 born=52 seen=56 proc=00000000000000000000000000111312 Jun 26 10:32:24 test4 lrmd: [3134]: info: operation start[12] on p_o2cb:0 for client 3137: pid 3460 exited with return code 0 Jun 26 10:32:24 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_o2cb:0_start_0 (call=12, rc=0, cib-update=18, confirmed=true) ok Jun 26 10:32:25 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: pingd (1000) Jun 26 10:32:25 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 14: pingd=1000 Jun 26 10:32:27 test4 lrmd: [3134]: info: rsc:fs_drbd:1 start[13] (pid 3538) Jun 26 10:32:27 test4 Filesystem[3538]: INFO: Running start for /dev/drbd0 on /var/www Jun 26 10:32:27 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) FATAL: Module scsi_hostadapter not found. Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) mount.ocfs2 Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) : Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) Cluster stack specified does not match the one currently running Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) while trying to join the group Jun 26 10:32:28 test4 lrmd: [3134]: info: RA output: (fs_drbd:1:start:stderr) Jun 26 10:32:28 test4 Filesystem[3538]: ERROR: Couldn't mount filesystem /dev/drbd0 on /var/www Jun 26 10:32:28 test4 lrmd: [3134]: info: operation start[13] on fs_drbd:1 for client 3137: pid 3538 exited with return code 1 Jun 26 10:32:28 test4 crmd: [3137]: info: process_lrm_event: LRM operation fs_drbd:1_start_0 (call=13, rc=1, cib-update=19, confirmed=true) unknown error Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_ais_dispatch: Update relayed from test4 Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: fail-count-fs_drbd:1 (INFINITY) Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 17: fail-count-fs_drbd:1=INFINITY Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_ais_dispatch: Update relayed from test4 Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_trigger_update: Sending flush op to all hosts for: last-failure-fs_drbd:1 (1372235547) Jun 26 10:32:28 test4 attrd: [3135]: notice: attrd_perform_update: Sent update 20: last-failure-fs_drbd:1=1372235547 Jun 26 10:32:28 test4 lrmd: [3134]: info: rsc:fs_drbd:1 stop[14] (pid 3610) Jun 26 10:32:28 test4 Filesystem[3610]: INFO: Running stop for /dev/drbd0 on /var/www Jun 26 10:32:29 test4 lrmd: [3134]: info: operation stop[14] on fs_drbd:1 for client 3137: pid 3610 exited with return code 0 Jun 26 10:32:29 test4 crmd: [3137]: info: process_lrm_event: LRM operation fs_drbd:1_stop_0 (call=14, rc=0, cib-update=20, confirmed=true) ok Jun 26 10:32:29 test4 lrmd: [3134]: info: rsc:p_o2cb:0 stop[15] (pid 3651) Jun 26 10:32:29 test4 ocfs2_controld: kill node 302186506 - ocfs2_controld PROCDOWN Jun 26 10:32:29 test4 stonith-ng: [3133]: info: initiate_remote_stonith_op: Initiating remote operation off for 302186506: 1be401d4-547d-4f59-b380-a5e996c70a31 Jun 26 10:32:29 test4 stonith-ng: [3133]: info: stonith_command: Processed st_query from test4-node1: rc=0 Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node test4 now has id: 302252042 Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node 302252042 is now known as test4 Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node test4-node2 now has id: 302186506 Jun 26 10:32:29 test4 stonith-ng: [3133]: info: crm_new_peer: Node 302186506 is now known as test4-node2 Jun 26 10:32:29 test4 o2cb[3651]: INFO: Stopping p_o2cb:0 Jun 26 10:32:29 test4 o2cb[3651]: INFO: Stopping ocfs2_controld.pcmk Jun 26 10:32:29 test4 lrmd: [3134]: info: RA output: (p_o2cb:0:stop:stderr) /usr/lib/ocf/resource.d//pacemaker/o2cb: line 171: killproc: command not found Jun 26 10:32:30 test4 lrmd: [3134]: info: operation stop[15] on p_o2cb:0 for client 3137: pid 3651 exited with return code 0 Jun 26 10:32:30 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_o2cb:0_stop_0 (call=15, rc=0, cib-update=21, confirmed=true) ok Jun 26 10:32:30 test4 lrmd: [3134]: info: rsc:p_controld:0 stop[16] (pid 3700) Jun 26 10:32:30 test4 dlm_controld.pcmk: [3452]: notice: terminate_ais_connection: Disconnecting from Corosync Jun 26 10:32:30 test4 kernel: [ 47.319346] dlm: closing connection to node 302186506 Jun 26 10:32:30 test4 kernel: [ 47.319361] dlm: closing connection to node 302120970 Jun 26 10:32:30 test4 kernel: [ 47.319371] dlm: closing connection to node 302252042 Jun 26 10:32:31 test4 lrmd: [3134]: info: RA output: (p_controld:0:stop:stderr) dlm_controld.pcmk: no process found Jun 26 10:32:32 test4 lrmd: [3134]: info: operation stop[16] on p_controld:0 for client 3137: pid 3700 exited with return code 0 Jun 26 10:32:32 test4 crmd: [3137]: info: process_lrm_event: LRM operation p_controld:0_stop_0 (call=16, rc=0, cib-update=22, confirmed=true) ok Jun 26 10:32:41 test4 stonith-ng: [3133]: ERROR: remote_op_done: Operation off of 302186506 by <no-one> for test4-node1[54fdc547-e40b-45ba-be79-2bdf1d021e7d]: Operation timed out Jun 26 10:32:41 test4 ocfs2_controld: Unknown result when kicking node 302186506 from the cluster Jun 26 10:32:41 test4 crmd: [3137]: notice: tengine_stonith_notify: Peer 302186506 was not terminated (off) by <anyone> for test4-node1: Operation timed out (ref=1be401d4-547d-4f59-b380-a5e996c70a31) Jun 26 10:32:41 test4 ocfs2_controld: Error from dlmc_fs_result: Invalid argument Jun 26 10:32:41 test4 ocfs2_controld: dlmcontrol connection died Jun 26 10:32:41 test4 ocfs2_controld: [3495]: notice: terminate_ais_connection: Disconnecting from Corosync > Also in the above thread the 4th to last post from Andreas Kurz > @Hastexo mentions this: > > This message was immediately followed by "Wrong stack" errors, > > and > check the content of /sysfs/fs/ocfs2/loaded_cluster_plugins ... > and if you have that configfile and it contains the value "user" this > is a good sign you have started ocfs2/o2cb via init It says user but the init-scripts are still disabled and not used. Also it says user since I loaded the ocfs2_stack_user module, before that change it was ocfs2. I also tried reformatting the cluster-drive, now I'm able to mount it by hand, but corosync still fails. Best regards Denis Witt _______________________________________________ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org