On 3/14/2011 6:18 AM, Dejan Muhamedagic wrote:
> On Fri, Mar 11, 2011 at 10:23:37AM -0800, Randy Katz wrote:
>> On 3/11/2011 3:29 AM, Dejan Muhamedagic wrote:
>>> Hi,
>>>
>>> On Fri, Mar 11, 2011 at 01:36:25AM -0800, Randy Katz wrote:
>>>> On 3/11/2011 12:50 AM, RaSca wrote:
>>>>>> On Fri 11 Mar 2011 07:32:32 CET, Randy Katz wrote:
>>>>>> ps - in /var/log/messages I find this:
>>>>>>
>>>>>> Mar 10 22:31:45 drbd1 lrmd: [3274]: ERROR: get_resource_meta: pclose
>>>>>> failed: Interrupted system call
>>>>>> Mar 10 22:31:45 drbd1 lrmd: [3274]: WARN: on_msg_get_metadata: empty
>>>>>> metadata for ocf::linbit::drbd.
>>>>>> Mar 10 22:31:45 drbd1 lrmadmin: [3481]: ERROR:
>>>>>> lrm_get_rsc_type_metadata(578): got a return code HA_FAIL from a reply
>>>>>> message of rmetadata with function get_ret_from_msg.
>>>>> [...]
>>>>>
>>>>> Hi,
>>>>> I think the message "no such resource agent" explains what the
>>>>> matter is.
>>>>> Does the file /usr/lib/ocf/resource.d/linbit/drbd exist? Is the drbd
>>>>> file executable? Have you correctly installed the drbd packages?
>>>>>
>>>>> Check those things; you could also try reinstalling drbd.
>>>>>
>>>> Hi
>>>>
>>>> # ls -l /usr/lib/ocf/resource.d/linbit/drbd
>>>> -rwxr-xr-x 1 root root 24523 Jun  4  2010
>>>> /usr/lib/ocf/resource.d/linbit/drbd
>>> Which cluster-glue version do you run?
>>> Try also:
>>>
>>> # lrmadmin -C
>>> # lrmadmin -P ocf drbd
>>> # export OCF_ROOT=/usr/lib/ocf
>>> # /usr/lib/ocf/resource.d/linbit/drbd meta-data
>> I was running from a source build/install as per clusterlabs.org, because
>> the RPMs had broken dependencies and would not install. I have since wiped
>> that CentOS machine (one of them) and installed openSUSE, since everything
>> was said to be included there, though that seems to be true of 11.3, not
>> 11.4; on 11.4 the install is broken, and so now
> I guess that openSUSE would like to hear about it too, namely in
> which way it is broken.
>
I did an openSUSE 11.4 install from DVD and then used zypper to install 
pacemaker, heartbeat, corosync and libpacemaker3. That pulled in a 
clusterlabs.repo with older versions, and I had to break a dependency for 
pacemaker or it would not install. I found out later that newer 
precompiled versions exist in the openSUSE repository; you just have to 
request the specific versions and they install fine. I had to remove the 
earlier packages first because of new dependencies. (A rough sketch of 
the zypper commands is below, after the version list.) The versions I 
ended up with are:

Name: pacemaker
Version: 1.1.5-3.2
Arch: x86_64
Vendor: openSUSE

Name: libpacemaker3
Version: 1.1.5-3.2
Arch: x86_64
Vendor: openSUSE

Name: heartbeat
Version: 3.0.4-25.28.1
Arch: x86_64
Vendor: openSUSE

Name: corosync
Version: 1.3.0-3.1
Arch: x86_64
Vendor: openSUSE
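
For what it's worth, this is roughly how I pinned those versions with 
zypper (from memory, so treat it as a sketch; the exact version strings 
are the ones listed above):

# zypper rm pacemaker libpacemaker3 heartbeat corosync
# zypper se -s pacemaker        # list the versions each repo offers
# zypper in pacemaker=1.1.5-3.2 libpacemaker3=1.1.5-3.2 \
    heartbeat=3.0.4-25.28.1 corosync=1.3.0-3.1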

At that point I was not sure whether to install iet or tgt; I had seen 
some examples using tgt, so I installed that. So far it looks like I have 
a working CRM, and I have mocked up the example from ha-iscsi.pdf (trying 
to work around some of its errors; there are errors!). I noticed the 
floating IP addresses do not ping, so I added a second pair; the new ones 
ping but the original ones still do not, so perhaps something else in the 
config is preventing that. Here are my current crm configure commands:

property stonith-enabled="false"
property no-quorum-policy="ignore"
property default-resource-stickiness="200"
primitive res_drbd_iscsivg01 ocf:linbit:drbd params drbd_resource="iscsivg01" op monitor interval="10s"
ms ms_drbd_iscsivg01 res_drbd_iscsivg01 meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
primitive res_drbd_iscsivg02 ocf:linbit:drbd params drbd_resource="iscsivg02" op monitor interval="10s"
ms ms_drbd_iscsivg02 res_drbd_iscsivg02 meta clone-max="2" notify="true"
primitive res_ip_alicebob01 ocf:heartbeat:IPaddr2 params ip="192.168.1.218" cidr_netmask="24" op monitor interval="10s"
primitive res_ip_alicebob02 ocf:heartbeat:IPaddr2 params ip="192.168.1.219" cidr_netmask="24" op monitor interval="10s"
primitive res_ip_c1c201 ocf:heartbeat:IPaddr2 params ip="192.168.1.220" cidr_netmask="24" op monitor interval="10s"
primitive res_ip_c1c202 ocf:heartbeat:IPaddr2 params ip="192.168.1.221" cidr_netmask="24" op monitor interval="10s"
primitive res_lvm_iscsivg01 ocf:heartbeat:LVM params volgrpname="iscsivg01" op monitor interval="30s"
primitive res_lvm_iscsivg02 ocf:heartbeat:LVM params volgrpname="iscsivg02" op monitor interval="30s"
primitive res_target_iscsivg01 ocf:heartbeat:iSCSITarget params iqn="iqn.2011-03.com.example:storage.example.iscsivg01" tid="1" op monitor interval="10s"
primitive res_target_iscsivg02 ocf:heartbeat:iSCSITarget params iqn="iqn.2011-03.com.example:storage.example.iscsivg02" tid="2" op monitor interval="10s"
primitive res_lu_iscsivg01_lun1 ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2011-03.com.example:storage.example.iscsivg01" lun="1" path="/dev/iscsivg01/lun1" op monitor interval="10s"
primitive res_lu_iscsivg01_lun2 ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2011-03.com.example:storage.example.iscsivg01" lun="2" path="/dev/iscsivg01/lun2" op monitor interval="10s"
primitive res_lu_iscsivg02_lun1 ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2011-03.com.example:storage.example.iscsivg02" lun="1" path="/dev/iscsivg02/lun1" op monitor interval="10s"
primitive res_lu_iscsivg02_lun2 ocf:heartbeat:iSCSILogicalUnit params target_iqn="iqn.2011-03.com.example:storage.example.iscsivg02" lun="2" path="/dev/iscsivg02/lun2" op monitor interval="10s"
group rg_iscsivg01 res_lvm_iscsivg01 res_target_iscsivg01 res_lu_iscsivg01_lun1 res_lu_iscsivg01_lun2 res_ip_alicebob01
group rg_iscsivg02 res_lvm_iscsivg02 res_target_iscsivg02 res_lu_iscsivg02_lun1 res_lu_iscsivg02_lun2 res_ip_alicebob02
order o_drbd_before_iscsivg01 inf: ms_drbd_iscsivg01:promote rg_iscsivg01:start
order o_drbd_before_iscsivg02 inf: ms_drbd_iscsivg02:promote rg_iscsivg02:start
colocation c_iscsivg01_on_drbd inf: rg_iscsivg01 ms_drbd_iscsivg01:Master
colocation c_iscsivg02_on_drbd inf: rg_iscsivg02 ms_drbd_iscsivg02:Master
commit
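
Looking at it again, ms_drbd_iscsivg02 is missing the meta attributes I 
gave ms_drbd_iscsivg01. I assume it should read something more like this 
(though I have not confirmed that this is what causes the failures 
further down):

ms ms_drbd_iscsivg02 res_drbd_iscsivg02 meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"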

I don't think /dev/iscsivg01/lun1 above is correct, as I don't see that 
device on the system. There are plenty of errors in the logs, but I have 
not tried to work through them yet. Strangely, 192.168.1.220 and 
192.168.1.221 ping fine, but 192.168.1.218 and 192.168.1.219 do not! I 
tried to query for a connectable iSCSI target from an initiator that I 
know works, and I get "Connection Failed." against 192.168.1.[218-221], 
so I don't think I have a valid iSCSI target at this point.
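
I'm guessing the logical volumes those paths refer to don't exist yet. If 
I understand it right, /dev/iscsivg01/lun1 would be an LV named lun1 
inside the iscsivg01 volume group, so I probably still need something 
like this on top of the DRBD device (the sizes here are just placeholders):

# vgs                                 # confirm the iscsivg01/iscsivg02 VGs exist
# lvcreate -L 10G -n lun1 iscsivg01
# lvcreate -L 10G -n lun2 iscsivg01
# lvs                                 # the LVs should then appear as /dev/<vg>/<lv>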

DRBD itself does seem to have started correctly:

# drbd-overview
   1:r0         Connected Primary/Primary   UpToDate/UpToDate C r-----
   2:iscsivg01  Connected Secondary/Primary UpToDate/UpToDate C r-----
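
Though I notice iscsivg02 does not show up in drbd-overview at all on 
this node, which presumably matches the "not installed" errors for 
res_drbd_iscsivg02 further down. I assume I still have to define that 
resource in the DRBD configuration and bring it up, roughly:

# drbdadm create-md iscsivg02
# drbdadm up iscsivg02
# drbd-overview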

Below is a sample of what shows up in the log quite frequently on c1. 
(The names bob and alice only appear in the IP labels, so c1/c2 should be 
fine as hostnames, and c1c201/c1c202 as IP labels.) I don't know how to 
upgrade the drbd tools (I put a guess after the log excerpt), though I 
believe DRBD itself is functioning fine:

Mar 15 07:20:22 c1 lrmd: [7196]: debug: perform_ra_op: resetting 
scheduler class to SCHED_OTHER
Mar 15 07:20:29 c1 lrmd: [11089]: debug: rsc:res_drbd_iscsivg01:0:29: 
monitor
Mar 15 07:20:29 c1 lrmd: [7226]: debug: perform_ra_op: resetting 
scheduler class to SCHED_OTHER
Mar 15 07:20:29 c1 lrmd: [11089]: info: RA output: 
(res_drbd_iscsivg01:0:monitor:stderr) DRBD module version: 8.3.9#012   
userland version: 8.3.8#012you should upgrade your drbd tools!
Mar 15 07:20:29 c1 drbd[7226]: DEBUG: iscsivg01: Calling 
/usr/sbin/crm_master -Q -l reboot -v 10000
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: 
init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/cib_rw
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: 
init_client_ipc_comms_nodispatch: Attempting to talk on: 
/var/run/crm/cib_callback
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: cib_native_signon_raw: 
Connection to CIB successful
Mar 15 07:20:29 c1 cib: [11088]: debug: acl_enabled: CIB ACL is disabled
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: query_node_uuidResult 
section <nodes >
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: query_node_uuidResult 
section <node id="c2" type="normal" uname="c2" />
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: query_node_uuidResult 
section <node id="c1" type="normal" uname="c1" />
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: query_node_uuidResult 
section </nodes>
Mar 15 07:20:29 c1 crm_attribute: [7256]: info: determine_host: Mapped 
c1 to c1
Mar 15 07:20:29 c1 crm_attribute: [7256]: info: attrd_lazy_update: 
Connecting to cluster... 5 retries remaining
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: 
init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/attrd
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: attrd_update_delegate: 
Sent update: master-res_drbd_iscsivg01:0=10000 for c1
Mar 15 07:20:29 c1 crm_attribute: [7256]: info: main: Update 
master-res_drbd_iscsivg01:0=10000 sent via attrd
Mar 15 07:20:29 c1 crm_attribute: [7256]: debug: cib_native_signoff: 
Signing out of the CIB Service
Mar 15 07:20:29 c1 attrd: [11090]: debug: attrd_local_callback: update 
message from crm_attribute: master-res_drbd_iscsivg01:0=10000
Mar 15 07:20:29 c1 attrd: [11090]: debug: attrd_local_callback: 
Supplied: 10000, Current: 10000, Stored: 10000
Mar 15 07:20:29 c1 crm_attribute: [7256]: info: crm_xml_cleanup: 
Cleaning up memory from libxml2
Mar 15 07:20:29 c1 drbd[7226]: DEBUG: iscsivg01: Exit code 0
Mar 15 07:20:29 c1 drbd[7226]: DEBUG: iscsivg01: Command output:
Mar 15 07:20:29 c1 lrmd: [11089]: debug: RA output: 
(res_drbd_iscsivg01:0:monitor:stdout)
Mar 15 07:20:32 c1 lrmd: [11089]: debug: rsc:res_ip_c1c201:19: monitor
Mar 15 07:20:32 c1 lrmd: [7266]: debug: perform_ra_op: resetting 
scheduler class to SCHED_OTHER
Mar 15 07:21:30 c1 lrmd: [11089]: info: RA output: 
(res_drbd_iscsivg01:0:monitor:stderr) DRBD module version: 8.3.9#012   
userland version: 8.3.8#012you should upgrade your drbd tools!
Mar 15 07:21:30 c1 drbd[7644]: DEBUG: iscsivg01: Calling 
/usr/sbin/crm_master -Q -l reboot -v 10000
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: 
init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/cib_rw
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: 
init_client_ipc_comms_nodispatch: Attempting to talk on: 
/var/run/crm/cib_callback
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: cib_native_signon_raw: 
Connection to CIB successful
Mar 15 07:21:30 c1 cib: [11088]: debug: acl_enabled: CIB ACL is disabled
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: query_node_uuidResult 
section <nodes >
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: query_node_uuidResult 
section <node id="c2" type="normal" uname="c2" />
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: query_node_uuidResult 
section <node id="c1" type="normal" uname="c1" />
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: query_node_uuidResult 
section </nodes>
Mar 15 07:21:30 c1 crm_attribute: [7674]: info: determine_host: Mapped 
c1 to c1
Mar 15 07:21:30 c1 crm_attribute: [7674]: info: attrd_lazy_update: 
Connecting to cluster... 5 retries remaining
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: 
init_client_ipc_comms_nodispatch: Attempting to talk on: /var/run/crm/attrd
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: attrd_update_delegate: 
Sent update: master-res_drbd_iscsivg01:0=10000 for c1
Mar 15 07:21:30 c1 crm_attribute: [7674]: info: main: Update 
master-res_drbd_iscsivg01:0=10000 sent via attrd
Mar 15 07:21:30 c1 crm_attribute: [7674]: debug: cib_native_signoff: 
Signing out of the CIB Service
Mar 15 07:21:30 c1 attrd: [11090]: debug: attrd_local_callback: update 
message from crm_attribute: master-res_drbd_iscsivg01:0=10000
Mar 15 07:21:30 c1 attrd: [11090]: debug: attrd_local_callback: 
Supplied: 10000, Current: 10000, Stored: 10000
Mar 15 07:21:30 c1 crm_attribute: [7674]: info: crm_xml_cleanup: 
Cleaning up memory from libxml2
Mar 15 07:21:30 c1 drbd[7644]: DEBUG: iscsivg01: Exit code 0
Mar 15 07:21:30 c1 drbd[7644]: DEBUG: iscsivg01: Command output:
Mar 15 07:21:30 c1 lrmd: [11089]: debug: RA output: 
(res_drbd_iscsivg01:0:monitor:stdout)
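
As for the "you should upgrade your drbd tools!" warning above, I take it 
that just means the userland drbd package (8.3.8) is older than the 
kernel module (8.3.9). I assume something along these lines would find 
and update it, though I have not tried it yet:

# zypper se -s drbd     # see which drbd userland packages/versions exist
# zypper up drbd        # assuming the userland package is simply named "drbd"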

On node c2 (Primary) I get this:

Mar 15 07:24:01 c2 crmd: [19033]: info: crm_timer_popped: PEngine 
Recheck Timer (I_PE_CALC) just popped! (900000ms)
Mar 15 07:24:01 c2 crmd: [19033]: debug: s_crmd_fsa: Processing 
I_PE_CALC: [ state=S_IDLE cause=C_TIMER_POPPED origin=crm_timer_popped ]
Mar 15 07:24:01 c2 crmd: [19033]: info: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC 
cause=C_TIMER_POPPED origin=crm_timer_popped ]
Mar 15 07:24:01 c2 crmd: [19033]: info: do_state_transition: Progressed 
to state S_POLICY_ENGINE after C_TIMER_POPPED
Mar 15 07:24:01 c2 crmd: [19033]: info: do_state_transition: All 2 
cluster nodes are eligible to run resources.
Mar 15 07:24:01 c2 crmd: [19033]: debug: do_fsa_action: actions:trace: 
#011// A_DC_TIMER_STOP
Mar 15 07:24:01 c2 crmd: [19033]: debug: do_fsa_action: actions:trace: 
#011// A_INTEGRATE_TIMER_STOP
Mar 15 07:24:01 c2 crmd: [19033]: debug: do_fsa_action: actions:trace: 
#011// A_FINALIZE_TIMER_STOP
Mar 15 07:24:01 c2 crmd: [19033]: debug: do_fsa_action: actions:trace: 
#011// A_PE_INVOKE
Mar 15 07:24:01 c2 crmd: [19033]: info: do_pe_invoke: Query 146: 
Requesting the current CIB: S_POLICY_ENGINE
Mar 15 07:24:01 c2 cib: [18983]: debug: acl_enabled: CIB ACL is disabled
Mar 15 07:24:01 c2 crmd: [19033]: info: do_pe_invoke_callback: Invoking 
the PE: query=146, ref=pe_calc-dc-1300199041-120, seq=184, quorate=1
Mar 15 07:24:01 c2 pengine: [19034]: info: unpack_config: Startup 
probes: enabled
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: STONITH 
timeout: 60000
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: STONITH of 
failed nodes is disabled
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: Stop all 
active resources: false
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: Cluster is 
symmetric - resources can run anywhere by default
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_config: Default 
stickiness: 200
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_config: On loss of 
CCM Quorum: Ignore
Mar 15 07:24:01 c2 pengine: [19034]: info: unpack_config: Node scores: 
'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Mar 15 07:24:01 c2 pengine: [19034]: info: unpack_domains: Unpacking domains
Mar 15 07:24:01 c2 pengine: [19034]: info: determine_online_status: Node 
c1 is online
Mar 15 07:24:01 c2 pengine: [19034]: info: determine_online_status: Node 
c2 is online
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: 
res_drbd_iscsivg01:0_monitor_0 on c1 returned 8 (master) instead of the 
expected value: 7 (not running)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Operation 
res_drbd_iscsivg01:0_monitor_0 found resource res_drbd_iscsivg01:0 
active in master mode on c1
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: 
res_drbd_iscsivg02:1_start_0 on c1 returned 5 (not installed) instead of 
the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Hard error - 
res_drbd_iscsivg02:1_start_0 failed with rc=5: Preventing 
ms_drbd_iscsivg02 from re-starting on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing 
failed op res_drbd_iscsivg02:1_start_0 on c1: not installed (5)
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: 
res_drbd_iscsivg02:1_stop_0 on c1 returned 5 (not installed) instead of 
the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Hard error - 
res_drbd_iscsivg02:1_stop_0 failed with rc=5: Preventing 
ms_drbd_iscsivg02 from re-starting on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing 
failed op res_drbd_iscsivg02:1_stop_0 on c1: not installed (5)
Mar 15 07:24:01 c2 pengine: [19034]: info: native_add_running: resource 
res_drbd_iscsivg02:1 isnt managed
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: 
res_lvm_iscsivg01_start_0 on c1 returned 1 (unknown error) instead of 
the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing 
failed op res_lvm_iscsivg01_start_0 on c1: unknown error (1)
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: 
res_drbd_iscsivg01:1_monitor_0 on c2 returned 0 (ok) instead of the 
expected value: 7 (not running)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Operation 
res_drbd_iscsivg01:1_monitor_0 found resource res_drbd_iscsivg01:1 
active on c2
Mar 15 07:24:01 c2 pengine: [19034]: debug: find_clone: Created orphan 
for ms_drbd_iscsivg02: res_drbd_iscsivg02:1 on c2
Mar 15 07:24:01 c2 pengine: [19034]: info: find_clone: Internally 
renamed res_drbd_iscsivg02:1 on c2 to res_drbd_iscsivg02:2 (ORPHAN)
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: 
res_drbd_iscsivg02:0_start_0 on c2 returned 5 (not installed) instead of 
the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Hard error - 
res_drbd_iscsivg02:0_start_0 failed with rc=5: Preventing 
ms_drbd_iscsivg02 from re-starting on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing 
failed op res_drbd_iscsivg02:0_start_0 on c2: not installed (5)
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: 
res_drbd_iscsivg02:0_stop_0 on c2 returned 5 (not installed) instead of 
the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: notice: unpack_rsc_op: Hard error - 
res_drbd_iscsivg02:0_stop_0 failed with rc=5: Preventing 
ms_drbd_iscsivg02 from re-starting on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing 
failed op res_drbd_iscsivg02:0_stop_0 on c2: not installed (5)
Mar 15 07:24:01 c2 pengine: [19034]: info: native_add_running: resource 
res_drbd_iscsivg02:0 isnt managed
Mar 15 07:24:01 c2 pengine: [19034]: debug: unpack_rsc_op: 
res_lvm_iscsivg01_start_0 on c2 returned 1 (unknown error) instead of 
the expected value: 0 (ok)
Mar 15 07:24:01 c2 pengine: [19034]: WARN: unpack_rsc_op: Processing 
failed op res_lvm_iscsivg01_start_0 on c2: unknown error (1)
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: 
res_ip_c1c201#011(ocf::heartbeat:IPaddr2):#011Started c1
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print: 
res_ip_c1c202#011(ocf::heartbeat:IPaddr2):#011Started c2
Mar 15 07:24:01 c2 pengine: [19034]: notice: group_print:  Resource 
Group: rg_iscsivg01
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_lvm_iscsivg01#011(ocf::heartbeat:LVM):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_target_iscsivg01#011(ocf::heartbeat:iSCSITarget):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_lu_iscsivg01_lun1#011(ocf::heartbeat:iSCSILogicalUnit):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_lu_iscsivg01_lun2#011(ocf::heartbeat:iSCSILogicalUnit):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_ip_alicebob01#011(ocf::heartbeat:IPaddr2):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: group_print:  Resource 
Group: rg_iscsivg02
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_lvm_iscsivg02#011(ocf::heartbeat:LVM):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_target_iscsivg02#011(ocf::heartbeat:iSCSITarget):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_lu_iscsivg02_lun1#011(ocf::heartbeat:iSCSILogicalUnit):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_lu_iscsivg02_lun2#011(ocf::heartbeat:iSCSILogicalUnit):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_ip_alicebob02#011(ocf::heartbeat:IPaddr2):#011Stopped
Mar 15 07:24:01 c2 pengine: [19034]: notice: clone_print:  Master/Slave 
Set: ms_drbd_iscsivg01 [res_drbd_iscsivg01]
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource 
res_drbd_iscsivg01:0 active on c1
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource 
res_drbd_iscsivg01:0 active on c1
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource 
res_drbd_iscsivg01:1 active on c2
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource 
res_drbd_iscsivg01:1 active on c2
Mar 15 07:24:01 c2 pengine: [19034]: notice: short_print:      Masters: 
[ c2 ]
Mar 15 07:24:01 c2 pengine: [19034]: notice: short_print:      Slaves: [ 
c1 ]
Mar 15 07:24:01 c2 pengine: [19034]: notice: clone_print:  Master/Slave 
Set: ms_drbd_iscsivg02 [res_drbd_iscsivg02]
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource 
res_drbd_iscsivg02:0 active on c2
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_drbd_iscsivg02:0#011(ocf::linbit:drbd):#011Slave c2 (unmanaged) FAILED
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_active: Resource 
res_drbd_iscsivg02:1 active on c1
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_drbd_iscsivg02:1#011(ocf::linbit:drbd):#011Slave c1 (unmanaged) FAILED
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: 
Resource res_ip_c1c202: preferring current location (node=c2, weight=200)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
res_lvm_iscsivg01 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing res_lvm_iscsivg01 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: 
Resource res_drbd_iscsivg01:1: preferring current location (node=c2, 
weight=1)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: notice: native_print:      
res_drbd_iscsivg02:1#011(ocf::linbit:drbd):#011Slave c1 (unmanaged) FAILED
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: 
Resource res_ip_c1c202: preferring current location (node=c2, weight=200)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
res_lvm_iscsivg01 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing res_lvm_iscsivg01 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: 
Resource res_drbd_iscsivg01:1: preferring current location (node=c2, 
weight=1)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c2
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing ms_drbd_iscsivg02 away from c2 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: 
Resource res_ip_c1c201: preferring current location (node=c1, weight=200)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
res_lvm_iscsivg01 has failed INFINITY times on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing res_lvm_iscsivg01 away from c1 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: common_apply_stickiness: 
Resource res_drbd_iscsivg01:0: preferring current location (node=c1, 
weight=1)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing ms_drbd_iscsivg02 away from c1 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing ms_drbd_iscsivg02 away from c1 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: info: get_failcount: 
ms_drbd_iscsivg02 has failed INFINITY times on c1
Mar 15 07:24:01 c2 pengine: [19034]: WARN: common_apply_stickiness: 
Forcing ms_drbd_iscsivg02 away from c1 after 1000000 failures (max=1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: 
Assigning c1 to res_ip_c1c201
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: 
Assigning c2 to res_ip_c1c202
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: 
Assigning c1 to res_drbd_iscsivg01:0
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: 
Assigning c2 to res_drbd_iscsivg01:1
Mar 15 07:24:01 c2 pengine: [19034]: debug: clone_color: Allocated 2 
ms_drbd_iscsivg01 instances of a possible 2
Mar 15 07:24:01 c2 pengine: [19034]: debug: master_color: 
res_drbd_iscsivg01:1 master score: 10000
Mar 15 07:24:01 c2 pengine: [19034]: info: master_color: Promoting 
res_drbd_iscsivg01:1 (Master c2)
Mar 15 07:24:01 c2 pengine: [19034]: debug: master_color: 
res_drbd_iscsivg01:0 master score: 10000
Mar 15 07:24:01 c2 pengine: [19034]: info: master_color: 
ms_drbd_iscsivg01: Promoted 1 instances of a possible 1 to master
Mar 15 07:24:01 c2 pengine: [19034]: info: rsc_merge_weights: 
res_lvm_iscsivg01: Rolling back scores from res_target_iscsivg01
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: All 
nodes for resource res_lvm_iscsivg01 are unavailable, unclean or 
shutting down (c1: 1, -1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Could 
not allocate a node for res_lvm_iscsivg01
Mar 15 07:24:01 c2 pengine: [19034]: info: native_color: Resource 
res_lvm_iscsivg01 cannot run anywhere
Mar 15 07:24:01 c2 pengine: [19034]: info: rsc_merge_weights: 
res_target_iscsivg01: Rolling back scores from res_lu_iscsivg01_lun1
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: All 
nodes for resource res_target_iscsivg01 are unavailable, unclean or 
shutting down (c1: 1, -1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Could 
not allocate a node for res_target_iscsivg01
Mar 15 07:24:01 c2 pengine: [19034]: info: native_color: Resource 
res_target_iscsivg01 cannot run anywhere
Mar 15 07:24:01 c2 pengine: [19034]: info: rsc_merge_weights: 
res_lu_iscsivg01_lun1: Rolling back scores from res_lu_iscsivg01_lun2
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: All 
nodes for resource res_lu_iscsivg01_lun1 are unavailable, unclean or 
shutting down (c1: 1, -1000000)
Mar 15 07:24:01 c2 pengine: [19034]: debug: native_assign_node: Could 
not allocate a node for res_lu_iscsivg01_lun1
Mar 15 07:24:01 c2 pengine: [19034]: info: native_color: Resource 
res_lu_iscsivg01_lun1 cannot run anywhere

and it keeps going...
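
From the "failed INFINITY times" / "Forcing ... away" messages I gather 
that, once the underlying LVM and drbd iscsivg02 problems are fixed, I 
will also have to clear the fail counts before the cluster retries those 
resources, presumably something like:

# crm resource cleanup res_lvm_iscsivg01
# crm resource cleanup ms_drbd_iscsivg02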

Any help will be greatly appreciated,
Thanks,
Randy


_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
