Dear list,

I am trying to set up a simple 3-node cluster with a SplitBrainDetector partition 
(which temporarily resides on a USB stick).

ietd.conf (iscsitarget) looks like this (/dev/sda is the USB stick):

Target iqn.2009-09.unibe.ch:myhost.unibe.ch
Lun 1 Path=/dev/sda,Type=fileio,ScsiId=WeissDochNich

On "ctdb-1", which is an iSCSI initiator, I did the following:

ctdb-1:~# sbd -v -d /dev/SplitBrainDetector/sbd create
(no output)
ctdb-1:~# sbd -v -d /dev/SplitBrainDetector/sbd list
(no output)
ctdb-1:~# sbd -v -d /dev/SplitBrainDetector/sbd dump
Header version     : 2
Number of slots    : 255
Sector size        : 512
Timeout (watchdog) : 5
Timeout (allocate) : 2
Timeout (loop)     : 1
Timeout (msgwait)  : 10

But I am not able to use the command below:

ctdb-1:~# sbd -v -d /dev/SplitBrainDetector/sbd message ctdb-1 test

No error message appears; sbd just prints its usage text again. I followed the 
instructions at http://www.linux-ha.org/SBD_Fencing
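
One thing that could be ruled out on each node is whether the device path is 
really a block device there. What follows is only a minimal sketch: the 
helper name check_sbd_dev is hypothetical (it is not part of sbd), and it 
uses nothing beyond the shell's own -b test and the "sbd -d <device> dump" 
invocation already shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of sbd): check that the SBD device path
# is a block device on this node before running sbd against it.
check_sbd_dev() {
    dev=$1
    if [ ! -b "$dev" ]; then
        echo "$dev: not a block device on $(uname -n)" >&2
        return 1
    fi
    # Same "dump" invocation as above, output discarded; the function
    # simply returns whatever exit status sbd reports.
    sbd -d "$dev" dump >/dev/null
}

# Example: check_sbd_dev /dev/SplitBrainDetector/sbd
```

Running this on ctdb-1, ctdb-2 and ctdb-3 would at least show whether all 
three initiators see the same device.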

What am I doing wrong?



The iSCSI target is hosted on Debian GNU/Linux 5.0.7 (lenny)
# ietd -v
iscsid version 0.4.16

The iSCSI initiator nodes are running Ubuntu 10.10.




The consequence of the above seems to be:

ctdb-1:~# crm configure
crm(live)configure# primitive STONED stonith:external/sbd params sbd_device=/dev/SplitBrainDetector/sbd
WARNING: STONED: default timeout 20s for start is smaller than the advised 60
crm(live)configure# end
There are changes pending. Do you want to commit them? y
WARNING: CIB changed in the meantime: won't touch it!
WARNING: STONED: default timeout 20s for start is smaller than the advised 60
Do you still want to commit? y
crm(live)# 

ctdb-1:~$ sudo crm configure show
node ctdb-1
node ctdb-2
node ctdb-3 \
        attributes standby="off"
primitive STONED stonith:external/sbd \
        params sbd_device="/dev/SplitBrainDetector/sbd"
property $id="cib-bootstrap-options" \
        dc-version="1.0.9-unknown" \
        cluster-infrastructure="openais" \
        expected-quorum-votes="3" \
        stonith-enabled="true" \
        stonith-timeout="30s"


Now a bit of troubleshooting:

ctdb-1:~$ sudo crm_verify -LV
crm_verify[3623]: 2011/01/12_16:15:31 ERROR: unpack_rsc_op: Hard error - STONED_start_0 failed with rc=2: Preventing STONED from re-starting on ctdb-1
crm_verify[3623]: 2011/01/12_16:15:31 WARN: unpack_rsc_op: Processing failed op STONED_start_0 on ctdb-1: invalid parameter (2)
crm_verify[3623]: 2011/01/12_16:15:31 ERROR: unpack_rsc_op: Hard error - STONED_start_0 failed with rc=2: Preventing STONED from re-starting on ctdb-2
crm_verify[3623]: 2011/01/12_16:15:31 WARN: unpack_rsc_op: Processing failed op STONED_start_0 on ctdb-2: invalid parameter (2)
crm_verify[3623]: 2011/01/12_16:15:31 WARN: common_apply_stickiness: Forcing STONED away from ctdb-1 after 1000000 failures (max=1000000)
crm_verify[3623]: 2011/01/12_16:15:31 WARN: common_apply_stickiness: Forcing STONED away from ctdb-2 after 1000000 failures (max=1000000)
crm_verify[3623]: 2011/01/12_16:15:31 WARN: stage6: Scheduling Node ctdb-3 for STONITH
Warnings found during check: config may not be valid


ctdb-1:~$ sudo crm_mon --one-shot --operations
============
Last updated: Wed Jan 12 16:25:27 2011
Stack: openais
Current DC: ctdb-1 - partition with quorum
Version: 1.0.9-unknown
3 Nodes configured, 3 expected votes
1 Resources configured.
============

Node ctdb-3: UNCLEAN (offline)
Online: [ ctdb-1 ctdb-2 ]

 STONED (stonith:external/sbd): Started ctdb-2 FAILED

Operations:
* Node ctdb-1: 
   STONED: migration-threshold=1000000 fail-count=1000000
    + (3) start: rc=2 (invalid parameter)
    + (4) stop: rc=0 (ok)
* Node ctdb-2: 
   STONED: migration-threshold=1000000 fail-count=1000000
    + (3) start: rc=2 (invalid parameter)

Failed actions:
    STONED_start_0 (node=ctdb-1, call=3, rc=2, status=complete): invalid parameter
    STONED_start_0 (node=ctdb-2, call=3, rc=2, status=complete): invalid parameter
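
To compare nodes at a glance, the failed start operations can be pulled out 
of the "Operations:" part of the crm_mon output above. This is only a rough 
sketch: the function name failed_starts is hypothetical, and it assumes the 
output keeps the "* Node <name>:" and "start: rc=<n>" layout shown here.

```shell
#!/bin/sh
# Hypothetical helper: read "crm_mon --one-shot --operations" output on
# stdin and print each node that has a start operation with rc != 0.
failed_starts() {
    awk '
        # Remember the node of the current "* Node <name>:" section.
        /^\* Node / { node = $3; sub(/:$/, "", node) }
        # A start operation whose return code begins with a non-zero digit.
        /start: rc=[1-9]/ { print node }
    '
}

# Example: sudo crm_mon --one-shot --operations | failed_starts
```

Fed the output above, this prints ctdb-1 and ctdb-2, one per line.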

ctdb-1:~$ sudo ptest --live-check -VVV
ptest[3902]: 2011/01/12_16:26:13 ERROR: unpack_rsc_op: Hard error - STONED_start_0 failed with rc=2: Preventing STONED from re-starting on ctdb-1
ptest[3902]: 2011/01/12_16:26:13 WARN: unpack_rsc_op: Processing failed op STONED_start_0 on ctdb-1: invalid parameter (2)
ptest[3902]: 2011/01/12_16:26:13 ERROR: unpack_rsc_op: Hard error - STONED_start_0 failed with rc=2: Preventing STONED from re-starting on ctdb-2
ptest[3902]: 2011/01/12_16:26:13 WARN: unpack_rsc_op: Processing failed op STONED_start_0 on ctdb-2: invalid parameter (2)
ptest[3902]: 2011/01/12_16:26:13 notice: native_print: STONED (stonith:external/sbd): Started ctdb-2 FAILED
ptest[3902]: 2011/01/12_16:26:13 WARN: common_apply_stickiness: Forcing STONED away from ctdb-1 after 1000000 failures (max=1000000)
ptest[3902]: 2011/01/12_16:26:13 WARN: common_apply_stickiness: Forcing STONED away from ctdb-2 after 1000000 failures (max=1000000)
ptest[3902]: 2011/01/12_16:26:13 WARN: stage6: Scheduling Node ctdb-3 for STONITH
ptest[3902]: 2011/01/12_16:26:13 notice: LogActions: Stop resource STONED (ctdb-2)


ctdb-1:~# ocf-tester -n STONED /usr/lib/stonith/plugins/external/sbd 
Beginning tests for /usr/lib/stonith/plugins/external/sbd...
* rc=1: Your agent has too restrictive permissions: should be 755
-:1: parser error : Document is empty

^
-:1: parser error : Start tag expected, '<' not found

^
I/O error : Invalid seek
* rc=1: Your agent produces meta-data which does not conform to ra-api-1.dtd
* rc=1: The meta-data action cannot fail and must return 0
* rc=1: Validation failed.  Did you supply enough options with -o ?
Aborting tests
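
Since ocf-tester complains about the agent's permissions, the actual mode of 
the plugin file can be inspected. Whether that complaint is related to the 
start failure is not clear from the output. A minimal sketch, assuming GNU 
stat (the -c option) is available; the helper name agent_mode is 
hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the octal permission bits of a file.
# Assumes GNU coreutils stat, where -c '%a' selects the octal mode.
agent_mode() {
    stat -c '%a' "$1"
}

# Example: agent_mode /usr/lib/stonith/plugins/external/sbd
```

A result other than 755 would match what ocf-tester reports.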


What am I doing wrong?


Kind regards,

---

Janosh