On 3/14/2011 6:18 AM, Dejan Muhamedagic wrote:
> On Fri, Mar 11, 2011 at 10:23:37AM -0800, Randy Katz wrote:
>> On 3/11/2011 3:29 AM, Dejan Muhamedagic wrote:
>>> Hi,
>>>
>>> On Fri, Mar 11, 2011 at 01:36:25AM -0800, Randy Katz wrote:
>>>> On 3/11/2011 12:50 AM, RaSca wrote:
>>>>> On Fri, 11 Mar 2011 07:32:32 CET, Randy Katz wrote:
>>>>>> ps - in /var/log/messages I find this:
>>>>>>
>>>>>> Mar 10 22:31:45 drbd1 lrmd: [3274]: ERROR: get_resource_meta: pclose failed: Interrupted system call
>>>>>> Mar 10 22:31:45 drbd1 lrmd: [3274]: WARN: on_msg_get_metadata: empty metadata for ocf::linbit::drbd.
>>>>>> Mar 10 22:31:45 drbd1 lrmadmin: [3481]: ERROR: lrm_get_rsc_type_metadata(578): got a return code HA_FAIL from a reply message of rmetadata with function get_ret_from_msg.
>>>>> [...]
>>>>>
>>>>> Hi,
>>>>> I think the message "no such resource agent" explains what the matter is.
>>>>> Does the file /usr/lib/ocf/resource.d/linbit/drbd exist? Is the drbd file executable? Have you installed the drbd packages correctly?
>>>>>
>>>>> Check those things; you can also try reinstalling drbd.
>>>>>
>>>> Hi,
>>>>
>>>> # ls -l /usr/lib/ocf/resource.d/linbit/drbd
>>>> -rwxr-xr-x 1 root root 24523 Jun  4  2010 /usr/lib/ocf/resource.d/linbit/drbd
>>> Which cluster-glue version do you run?
>>> Also try:
>>>
>>> # lrmadmin -C
>>> # lrmadmin -P ocf drbd
>>> # export OCF_ROOT=/usr/lib/ocf
>>> # /usr/lib/ocf/resource.d/linbit/drbd meta-data
>> I am running from a source build/install as per clusterlabs.org, since the RPMs had broken dependencies and would not install. I have now blown away that CentOS machine (one of them) and installed openSUSE, as they said everything was included; but that seems to hold for 11.3, not for 11.4: on 11.4 the install is broken, and so now
> I guess that openSUSE would like to hear about it too, namely in which way it is broken.

I did an openSUSE 11.4 install from DVD.
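As an aside on Dejan's suggestion above: I read it as exercising the OCF agent by hand, outside lrmd. A minimal sketch of what I understand that to mean (the existence guard and the `head` are my own additions, not from his mail):

```shell
# Exercise the DRBD OCF resource agent directly, outside lrmd.
# The agent path is the one confirmed with ls -l above; the guard is mine.
export OCF_ROOT=/usr/lib/ocf
AGENT="$OCF_ROOT/resource.d/linbit/drbd"
if [ -x "$AGENT" ]; then
    # A working agent prints its XML metadata; show just the start of it.
    "$AGENT" meta-data | head -n 5
else
    echo "agent not found or not executable at $AGENT"
fi
```

If that prints the XML metadata header, the agent itself would seem to be fine, and the "empty metadata" error would point at lrmd rather than at the agent.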
I then used zypper to install pacemaker, heartbeat, corosync, and libpacemaker3. I ended up with a clusterlabs.repo and older versions, and had to break a dependency for pacemaker or it would not install. I found out later that there are newer, precompiled versions in the openSUSE repository; you just need to request the specific versions and they will install. I had to remove the previous packages, as some new dependencies were created. The versions I ended up with are:
Name: pacemaker       Version: 1.1.5-3.2      Arch: x86_64   Vendor: openSUSE
Name: libpacemaker3   Version: 1.1.5-3.2      Arch: x86_64   Vendor: openSUSE
Name: heartbeat       Version: 3.0.4-25.28.1  Arch: x86_64   Vendor: openSUSE
Name: corosync        Version: 1.3.0-3.1      Arch: x86_64   Vendor: openSUSE

At this point I was not sure whether to install iet or tgt; I saw some examples with tgt, so I installed that. So far it looks like I have CRM working, and I have mocked up the example from ha-iscsi.pdf (trying to mitigate some of the errors; there are errors!). I noticed the floating IP addresses do not ping, so I added a new set; those ping, though the original ones still do not. Perhaps something else in the config is prohibiting that. Here are my current crm configure commands:

property stonith-enabled="false"
property no-quorum-policy="ignore"
property default-resource-stickiness="200"
primitive res_drbd_iscsivg01 ocf:linbit:drbd params drbd_resource="iscsivg01" op monitor interval="10s"
ms ms_drbd_iscsivg01 res_drbd_iscsivg01 meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
primitive res_drbd_iscsivg02 ocf:linbit:drbd params drbd_resource="iscsivg02" op monitor interval="10s"
ms ms_drbd_iscsivg02 res_drbd_iscsivg02 meta clone-max="2" notify="true"
primitive res_ip_alicebob01 ocf:heartbeat:IPaddr2 params ip="192.168.1.218" cidr_netmask="24" op monitor interval="10s"
primitive res_ip_alicebob02 ocf:heartbeat:IPaddr2 params ip="192.168.1.219" cidr_netmask="24" op monitor interval="10s"
primitive res_ip_c1c201 ocf:heartbeat:IPaddr2 params ip="192.168.1.220" cidr_netmask="24" op monitor interval="10s"
primitive res_ip_c1c202 ocf:heartbeat:IPaddr2 params ip="192.168.1.221" cidr_netmask="24" op monitor interval="10s"
primitive res_lvm_iscsivg01 ocf:heartbeat:LVM params volgrpname="iscsivg01" op monitor interval="30s"
primitive res_lvm_iscsivg02 ocf:heartbeat:LVM params volgrpname="iscsivg02" op monitor interval="30s"
primitive res_target_iscsivg01 ocf:heartbeat:iSCSITarget params iqn="iqn.2011-03.com.example:storage.example.iscsivg01" tid="1" op monitor interval="10s"
primitive res_target_iscsivg02 ocf:heartbeat:iSCSITarget params iqn="iqn.2011-03.com.example:storage.example.iscsivg02" tid="2" op monitor interval="10s"
primitive res_lu_iscsivg01_lun1 ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2011-03.com.example:storage.example.iscsivg01" lun="1" path="/dev/iscsivg01/lun1" op monitor interval="10s"
primitive res_lu_iscsivg01_lun2 ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2011-03.com.example:storage.example.iscsivg01" lun="2" path="/dev/iscsivg01/lun2" op monitor interval="10s"
primitive res_lu_iscsivg02_lun1 ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2011-03.com.example:storage.example.iscsivg02" lun="1" path="/dev/iscsivg02/lun1" op monitor interval="10s"
primitive res_lu_iscsivg02_lun2 ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2011-03.com.example:storage.example.iscsivg02" lun="2" path="/dev/iscsivg02/lun2" op monitor interval="10s"
group rg_iscsivg01 res_lvm_iscsivg01 res_target_iscsivg01 res_lu_iscsivg01_lun1 res_lu_iscsivg01_lun2 res_ip_alicebob01
group rg_iscsivg02 res_lvm_iscsivg02 res_target_iscsivg02 res_lu_iscsivg02_lun1 res_lu_iscsivg02_lun2 res_ip_alicebob02
order o_drbd_before_iscsivg01 inf: ms_drbd_iscsivg01:promote rg_iscsivg01:start
order o_drbd_before_iscsivg02 inf: ms_drbd_iscsivg02:promote rg_iscsivg02:start
colocation c_iscsivg01_on_drbd inf: rg_iscsivg01 ms_drbd_iscsivg01:Master
colocation c_iscsivg02_on_drbd inf: rg_iscsivg02 ms_drbd_iscsivg02:Master
commit

I don't think /dev/iscsivg01/lun1 is correct above, as I don't see it. There are plenty of errors in the logs, but I have not tried to figure any of them out yet. Strangely, 192.168.1.220 and 192.168.1.221 ping fine, but 192.168.1.218 and 192.168.1.219 do not! I tried to query for a connectable iSCSI target from an initiator that I know works, and I get "Connection Failed." against 192.168.1.[218-221], so I don't think I have a valid iSCSI target at this point. It seems to have started the DRBD correctly:

# drbd-overview
  1:r0         Connected Primary/Primary    UpToDate/UpToDate C r-----
  2:iscsivg01  Connected Secondary/Primary  UpToDate/UpToDate C r-----

Below is a sample of what occurs in the log quite frequently, from c1 (the earlier examples only use the words bob and/or alice in the IP labels, so c1/c2 should be fine, as should c1c201/c1c202 for the IP labels). I don't know how to upgrade the drbd tools, though I believe everything is functioning fine there:

Mar 15 07:20:22 c1 lrmd: [7196]: debug: perform_ra_op: resetting scheduler class to SCHED_OTHER
Mar 15 07:20:29 c1 lrmd: [11089]: debug: rsc:res_drbd_iscsivg01:0:29: monitor
Mar 15 07:20:29 c1 lrmd: [7226]: debug: perform_ra_op: resetting scheduler class to SCHED_OTHER
Mar 15 07:20:29 c1 lrmd: [11089]: info: RA output: (res_drbd_iscsivg01:0:monitor:stderr) DRBD module version: 8.3.9#012 userland version: 8.3.8#012you should upgrade your drbd tools!
Mar 15 07:20:29 c1 drbd[7226]: DEBUG: iscsivg01: Calling /usr/sbin/crm_master -Q -l reboot -v 10000
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/cib_rw
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/cib_callback
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: cib_native_signon_raw: Connection to CIB successful
Mar 15 07:20:29 c1 cib: [11088]: debug: acl_enabled: CIB ACL is disabled
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: query_node_uuidResult section <nodes >
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: query_node_uuidResult section <node id="c2" type="normal" uname="c2" />
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: query_node_uuidResult section <node id="c1" type="normal" uname="c1" />
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: query_node_uuidResult section </nodes>
Mar 15 07:20:29 c1 crm_attribute: [7256]: info: determine_host: Mapped c1 to c1
Mar 15 07:20:29 c1 crm_attribute: [7256]: info: attrd_lazy_update: Connecting to cluster... 5 retries remaining
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/attrd
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: attrd_update_delegate: Sent update: master-res_drbd_iscsivg01:0=10000 for c1
Mar 15 07:20:29 c1 crm_attribute: [7256]: info: main: Update master-res_drbd_iscsivg01:0=10000 sent via attrd
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: cib_native_signoff: Signing out of the CIB Service
Mar 15 07:20:29 c1 attrd: [11090]: debug: attrd_local_callback: update message from crm_attribute: master-res_drbd_iscsivg01:0=10000
Mar 15 07:20:29 c1 attrd: [11090]: debug: attrd_local_callback: Supplied: 10000, Current: 10000, Stored: 10000
Mar 15 07:20:29 c1 crm_attribute: [7256]: info: crm_xml_cleanup: Cleaning up memory from libxml2
Mar 15 07:20:29 c1 drbd[7226]: DEBUG: iscsivg01: Exit code 0
Mar 15 07:20:29 c1 drbd[7226]: DEBUG: iscsivg01: Command output:
Mar 15 07:20:29 c1 lrmd: [11089]: debug: RA output: (res_drbd_iscsivg01:0:monitor:stdout)
Mar 15 07:20:32 c1 lrmd: [11089]: debug: rsc:res_ip_c1c201:19: monitor
Mar 15 07:20:32 c1 lrmd: [7266]: debug: perform_ra_op: resetting scheduler class to SCHED_OTHER
Mar 15 07:21:30 c1 lrmd: [11089]: info: RA output: (res_drbd_iscsivg01:0:monitor:stderr) DRBD module version: 8.3.9#012 userland version: 8.3.8#012you should upgrade your drbd tools!
Mar 15 07:21:30 c1 drbd[7644]: DEBUG: iscsivg01: Calling /usr/sbin/crm_master -Q -l reboot -v 10000
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/cib_rw
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/cib_callback
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: cib_native_signon_raw: Connection to CIB successful
Mar 15 07:21:30 c1 cib: [11088]: debug: acl_enabled: CIB ACL is disabled
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: query_node_uuidResult section <nodes >
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: query_node_uuidResult section <node id="c2" type="normal" uname="c2" />
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: query_node_uuidResult section <node id="c1" type="normal" uname="c1" />
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: query_node_uuidResult section </nodes>
Mar 15 07:21:30 c1 crm_attribute: [7674]: info: determine_host: Mapped c1 to c1
Mar 15 07:21:30 c1 crm_attribute: [7674]: info: attrd_lazy_update: Connecting to cluster... 5 retries remaining
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/attrd
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: attrd_update_delegate: Sent update: master-res_drbd_iscsivg01:0=10000 for c1
Mar 15 07:21:30 c1 crm_attribute: [7674]: info: main: Update master-res_drbd_iscsivg01:0=10000 sent via attrd
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: cib_native_signoff: Signing out of the CIB Service
Mar 15 07:21:30 c1 attrd: [11090]: debug: attrd_local_callback: update message from crm_attribute: master-res_drbd_iscsivg01:0=10000
Mar 15 07:21:30 c1 attrd: [11090]: debug: attrd_local_callback: Supplied: 10000, Current: 10000, Stored: 10000
Mar 15 07:21:30 c1 crm_attribute: [7674]: info: crm_xml_cleanup: Cleaning up memory from libxml2
Mar 15 07:21:30 c1 drbd[7644]: DEBUG: iscsivg01: Exit code 0
Mar 15 07:21:30 c1 drbd[7644]: DEBUG: iscsivg01: Command output:
Mar 15 07:21:30 c1 lrmd: [11089]: debug: RA output: (res_drbd_iscsivg01:0:monitor:stdout)

On node c2 (Primary) I get this:

Mar 15 07:24:01 c2 crmd: [19033]: info: crm_timer_popped: PEngine Recheck Timer (I_PE_CALC) just popped! (900000ms)
Mar 15 07:24:01 c2 crmd: [19033]: debug: s_crmd_fsa: Processing I_PE_CALC: [ state=S_IDLE cause=C_TIMER_POPPED origin=crm_timer_popped ]
Mar 15 07:24:01 c2 crmd: [19033]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
Mar 15 07:24:01 c2 crmd: [19033]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
Mar 15 07:24:01 c2 crmd: [19033]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Mar 15 07:24:01 c2 crmd: [19033]: debug: do_fsa_action: actions:trace: #011// A_DC_TIMER_STOP
Mar 15 07:24:01 c2 crmd: [19033]: debug: do_fsa_action: actions:trace: #011// A_INTEGRATE_TIMER_STOP
Mar 15 07:24:01 c2 crmd: [19033]: debug: do_fsa_action: actions:trace: #011// A_FINALIZE_TIMER_STOP
Mar 15 07:24:01 c2 crmd: [19033]: debug: do_fsa_action: actions:trace: #011// A_PE_INVOKE
Mar 15 07:24:01 c2 crmd: [19033]: info: do_pe_invoke: Query 146: Requesting the current CIB: S_POLICY_ENGINE
Mar 15 07:24:01 c2 cib: [18983]: debug: acl_enabled: CIB ACL is disabled
Mar 15 07:24:01 c2 crmd: [19033]: info: do_pe_invoke_callback: Invoking the PE: query=146, ref=pe_calc-dc-1300199041-120, seq=184, quorate=1
Mar 15 07:24:01 c2 pengine: [19034]: info: unpack_config: Startup probes: enabled
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: STONITH timeout: 60000
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: STONITH of failed nodes is disabled
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: Stop all active resources: false
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: Cluster is symmetric - resources can run anywhere by default
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: Default stickiness: 200
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_config: On loss of CCM Quorum: Ignore
Mar 15 07:24:01 c2 pengine: [19034]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 15 07:24:01 c2 pengine: [19034]: info: unpack_domains: Unpacking domains
Mar 15 07:24:01 c2 pengine: [19034]: info: determine_online_status: Node c1 is online
Mar 15 07:24:01 c2 pengine: [19034]: info: determine_online_status: Node c2 is online
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: res_drbd_iscsivg01:0_monitor_0 on c1 returned 8 (master) instead of the expected value: 7 (not running)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Operation res_drbd_iscsivg01:0_monitor_0 found resource res_drbd_iscsivg01:0 active in master mode on c1
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: res_drbd_iscsivg02:1_start_0 on c1 returned 5 (not installed) instead of the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Hard error - res_drbd_iscsivg02:1_start_0 failed with rc=5: Preventing ms_drbd_iscsivg02 from re-starting on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing failed op res_drbd_iscsivg02:1_start_0 on c1: not installed (5)
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: res_drbd_iscsivg02:1_stop_0 on c1 returned 5 (not installed) instead of the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Hard error - res_drbd_iscsivg02:1_stop_0 failed with rc=5: Preventing ms_drbd_iscsivg02 from re-starting on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing failed op res_drbd_iscsivg02:1_stop_0 on c1: not installed (5)
Mar 15 07:24:01 c2 pengine: [19034]: info: native_add_running: resource res_drbd_iscsivg02:1 isnt managed
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: res_lvm_iscsivg01_start_0 on c1 returned 1 (unknown error) instead of the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing failed op res_lvm_iscsivg01_start_0 on c1: unknown error (1)
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: res_drbd_iscsivg01:1_monitor_0 on c2 returned 0 (ok) instead of the expected value: 7 (not running)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Operation res_drbd_iscsivg01:1_monitor_0 found resource res_drbd_iscsivg01:1 active on c2
Mar 15 07:24:01 c2 pengine: [19034]: debug: find_clone: Created orphan for ms_drbd_iscsivg02: res_drbd_iscsivg02:1 on c2
Mar 15 07:24:01 c2 pengine: [19034]: info: find_clone: Internally renamed res_drbd_iscsivg02:1 on c2 to res_drbd_iscsivg02:2 (ORPHAN)
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: res_drbd_iscsivg02:0_start_0 on c2 returned 5 (not installed) instead of the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Hard error - res_drbd_iscsivg02:0_start_0 failed with rc=5: Preventing ms_drbd_iscsivg02 from re-starting on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing failed op res_drbd_iscsivg02:0_start_0 on c2: not installed (5)
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: res_drbd_iscsivg02:0_stop_0 on c2 returned 5 (not installed) instead of the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Hard error - res_drbd_iscsivg02:0_stop_0 failed with rc=5: Preventing ms_drbd_iscsivg02 from re-starting on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing failed op res_drbd_iscsivg02:0_stop_0 on c2: not installed (5)
Mar 15 07:24:01 c2 pengine: [19034]: info: native_add_running: resource res_drbd_iscsivg02:0 isnt managed
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: res_lvm_iscsivg01_start_0 on c2 returned 1 (unknown error) instead of the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing failed op res_lvm_iscsivg01_start_0 on c2: unknown error (1)
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_ip_c1c201#011(ocf::heartbeat:IPaddr2):#011Started c1
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_ip_c1c202#011(ocf::heartbeat:IPaddr2):#011Started c2
Mar 15 07:24:01 c2 pengine: [19034]: notice: group_print: Resource Group: rg_iscsivg01
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_lvm_iscsivg01#011(ocf::heartbeat:LVM):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_target_iscsivg01#011(ocf::heartbeat:iSCSITarget):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_lu_iscsivg01_lun1#011(ocf::heartbeat:iSCSILogicalUnit):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_lu_iscsivg01_lun2#011(ocf::heartbeat:iSCSILogicalUnit):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_ip_alicebob01#011(ocf::heartbeat:IPaddr2):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: group_print: Resource Group: rg_iscsivg02
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_lvm_iscsivg02#011(ocf::heartbeat:LVM):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_target_iscsivg02#011(ocf::heartbeat:iSCSITarget):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_lu_iscsivg02_lun1#011(ocf::heartbeat:iSCSILogicalUnit):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_lu_iscsivg02_lun2#011(ocf::heartbeat:iSCSILogicalUnit):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_ip_alicebob02#011(ocf::heartbeat:IPaddr2):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: clone_print: Master/Slave Set: ms_drbd_iscsivg01 [res_drbd_iscsivg01]
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource res_drbd_iscsivg01:0 active on c1
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource res_drbd_iscsivg01:0 active on c1
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource res_drbd_iscsivg01:1 active on c2
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource res_drbd_iscsivg01:1 active on c2
Mar 15 07:24:01 c2 pengine: [19034]: notice: short_print: Masters: [ c2 ]
Mar 15 07:24:01 c2 pengine: [19034]: notice: short_print: Slaves: [ c1 ]
Mar 15 07:24:01 c2 pengine: [19034]: notice: clone_print: Master/Slave Set: ms_drbd_iscsivg02 [res_drbd_iscsivg02]
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource res_drbd_iscsivg02:0 active on c2
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_drbd_iscsivg02:0#011(ocf::linbit:drbd):#011Slave c2 (unmanaged) FAILED
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource res_drbd_iscsivg02:1 active on c1
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_drbd_iscsivg02:1#011(ocf::linbit:drbd):#011Slave c1 (unmanaged) FAILED
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: Resource res_ip_c1c202: preferring current location (node=c2, weight=200)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: res_lvm_iscsivg01 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing res_lvm_iscsivg01 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: Resource res_drbd_iscsivg01:1: preferring current location (node=c2, weight=1)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: res_drbd_iscsivg02:1#011(ocf::linbit:drbd):#011Slave c1 (unmanaged) FAILED
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: Resource res_ip_c1c202: preferring current location (node=c2, weight=200)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: res_lvm_iscsivg01 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing res_lvm_iscsivg01 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: Resource res_drbd_iscsivg01:1: preferring current location (node=c2, weight=1)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: Resource res_ip_c1c201: preferring current location (node=c1, weight=200)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: res_lvm_iscsivg01 has failed INFINITY times on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing res_lvm_iscsivg01 away from c1 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: Resource res_drbd_iscsivg01:0: preferring current location (node=c1, weight=1)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing ms_drbd_iscsivg02 away from c1 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing ms_drbd_iscsivg02 away from c1 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: ms_drbd_iscsivg02 has failed INFINITY times on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: Forcing ms_drbd_iscsivg02 away from c1 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Assigning c1 to res_ip_c1c201
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Assigning c2 to res_ip_c1c202
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Assigning c1 to res_drbd_iscsivg01:0
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Assigning c2 to res_drbd_iscsivg01:1
Mar 15 07:24:01 c2 pengine: [19034]: debug: clone_color: Allocated 2 ms_drbd_iscsivg01 instances of a possible 2
Mar 15 07:24:01 c2 pengine: [19034]: debug: master_color: res_drbd_iscsivg01:1 master score: 10000
Mar 15 07:24:01 c2 pengine: [19034]: info: master_color: Promoting res_drbd_iscsivg01:1 (Master c2)
Mar 15 07:24:01 c2 pengine: [19034]: debug: master_color: res_drbd_iscsivg01:0 master score: 10000
Mar 15 07:24:01 c2 pengine: [19034]: info: master_color: ms_drbd_iscsivg01: Promoted 1 instances of a possible 1 to master
Mar 15 07:24:01 c2 pengine: [19034]: info: rsc_merge_weights: res_lvm_iscsivg01: Rolling back scores from res_target_iscsivg01
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: All nodes for resource res_lvm_iscsivg01 are unavailable, unclean or shutting down (c1: 1, -1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Could not allocate a node for res_lvm_iscsivg01
Mar 15 07:24:01 c2 pengine: [19034]: info: native_color: Resource res_lvm_iscsivg01 cannot run anywhere
Mar 15 07:24:01 c2 pengine: [19034]: info: rsc_merge_weights: res_target_iscsivg01: Rolling back scores from res_lu_iscsivg01_lun1
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: All nodes for resource res_target_iscsivg01 are unavailable, unclean or shutting down (c1: 1, -1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Could not allocate a node for res_target_iscsivg01
Mar 15 07:24:01 c2 pengine: [19034]: info: native_color: Resource res_target_iscsivg01 cannot run anywhere
Mar 15 07:24:01 c2 pengine: [19034]: info: rsc_merge_weights: res_lu_iscsivg01_lun1: Rolling back scores from res_lu_iscsivg01_lun2
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: All nodes for resource res_lu_iscsivg01_lun1 are unavailable, unclean or shutting down (c1: 1, -1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Could not allocate a node for res_lu_iscsivg01_lun1
Mar 15 07:24:01 c2 pengine: [19034]: info: native_color: Resource res_lu_iscsivg01_lun1 cannot run anywhere

and it keeps going...

Any help will be greatly appreciated,
Thanks,
Randy
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems