On Friday 14 October 2016 04:00 AM, Andy Zhou wrote:
Done. Now it shows the following.
[root@h2 ovs]# crm configure show
node1: h1 \
attributes
node2: h2
primitiveClusterIP IPaddr2 \
paramsip=10.33.75.200cidr_netmask=32\
opstart interval=0stimeout=20s\
opstop interval=0stimeout=20s\
opmonitor interval=30s
primitiveovndb ocf:ovn:ovndb-servers \
opstart interval=0stimeout=30s\
opstop interval=0stimeout=20s\
oppromote interval=0stimeout=50s\
opdemote interval=0stimeout=50s\
opmonitor interval=1min\
meta
msovndb-master ovndb\
metanotify=true
colocationcolocation-ovndb-master-ClusterIP-INFINITY inf:
ovndb-master:Started ClusterIP:Master
orderorder-ClusterIP-ovndb-master-mandatory inf: ClusterIP:start
ovndb-master:start
propertycib-bootstrap-options: \
have-watchdog=false\
dc-version=1.1.13-10.el7_2.4-44eb2dd\
cluster-infrastructure=corosync\
cluster-name=mycluster\
stonith-enabled=false
propertyovn_ovsdb_master_server: \
My installation does not have ocf-tester, There is a program called
ocft with a test option. Not sure if this is a suitable replacement.
If not, how could I get
the ocf-tester program? I ran the ocft program and get the following
output. Not sure what it means.
[root@h2 ovs]# ocft test -n test-ovndb -o master_ip 10.0.0.1
/usr/share/openvswitch/scripts/ovndb-servers.ocf
ERROR: cases directory not found.
I have attached ocf-tester with this mail. I guess it's a standalone
script. If it does not work, I think it's better not to attempt anymore
as we have another way to find out.
Alternately, you can check if the ovsdb servers are started
properly by running
/usr/share/openvswitch/scripts/ovn-ctl --db-sb-sync-from=10.0.0.1
--db-nb-sync-from=10.0.0.1 start_ovsdb
The output are as follows. Should we use --db-sb-sync-from-addr instead?
[root@h2 ovs]# /usr/share/openvswitch/scripts/ovn-ctl
--db-sb-sync-from=10.0.0.1 --db-nb-sync-from=10.0.0.1 start_ovsdb
/usr/share/openvswitch/scripts/ovn-ctl: unknown option
"--db-sb-sync-from=10.0.0.1" (use --help for help)
/usr/share/openvswitch/scripts/ovn-ctl: unknown option
"--db-nb-sync-from=10.0.0.1" (use --help for help)
'ovn-ctl' runs without any error message after I fixed the command
line parameter.
I am sorry for the misinformation, Andy. What you ran is correct. Could
you check the status of the ovsdb servers in h2 after you run the above
command using the following commands.
ovn-ctl status_ovnnb
ovn-ctl status_ovnsb
Both the above commands should return "running/backup". If you see in
the OCF script in the function ovsdb_server_start(), we wait
indefinitely till the DB servers are started. Since the 'start' action
on h2 times out, I doubt that the servers are not started properly.
I think this is mostly due to the crm configuration. Once you add
the 'ms' and 'colocation' resources, you should be able to
overcome this problem.
No, ovndb still failed to launch on h2.
[root@h2 ovs]# crm status
Last updated: Thu Oct 13 11:27:42 2016Last change: Thu Oct 13 11:17:25
2016 by root via cibadmin on h2
Stack: corosync
Current DC: h2 (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
2 nodes and 3 resources configured
*Online*: [ h1 h2 ]
Full list of resources:
*ClusterIP*(ocf::heartbeat:IPaddr2):*Started*h1
*Master*/*Slave*Set: ovndb-*master*[ovndb]
*Master*s: [ h1 ]
*Stopped*: [ h2 ]
Failed Actions:
* ovndb_start_0 on h2 '*unknown error*' (1): call=39, status=*Timed
Out*, exitreason='none',
last-rc-change='Wed Oct 12 14:43:03 2016', queued=0ms, exec=30003
I have never tried colocating two resources with the ClusterIP
resource. Just for testing, is it possible to drop the WebServer
resource?
Done. It did not make any difference that I can see.
#!/bin/sh
#
# $Id: ocf-tester,v 1.2 2006/08/14 09:38:20 andrew Exp $
#
# Copyright (c) 2006 Novell Inc, Andrew Beekhof
# All Rights Reserved.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of version 2 of the GNU General Public License as
# published by the Free Software Foundation.
#
# This program is distributed in the hope that it would be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# Further, this software is distributed without any warranty that it is
# free of the rightful claim of any third person regarding infringement
# or the like. Any license provided herein, whether implied or
# otherwise, applies only to this software file. Patent licenses, if
# any, provided herein do not apply to combinations of this program with
# other software, or any other product whatsoever.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write the Free Software Foundation,
# Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
#
LRMD=/usr/lib/heartbeat/lrmd
LRMADMIN=/usr/sbin/lrmadmin
DATADIR=/usr/share
METADATA_LINT="xmllint --noout --valid -"
# set some common meta attributes, which are expected to be
# present by resource agents
export OCF_RESKEY_CRM_meta_timeout=20000 # 20 seconds timeout
export OCF_RESKEY_CRM_meta_interval=10000 # reset this for probes
num_errors=0
info() {
[ "$quiet" -eq 1 ] && return
echo "$*"
}
debug() {
[ "$verbose" -eq 0 ] && return
echo "$*"
}
usage() {
# make sure to output errors on stderr
[ "x$1" = "x0" ] || exec >&2
echo "Tool for testing if a cluster resource is OCF compliant"
echo ""
echo "Usage: ocf-tester [-Lh] -n resource_name [-o name=value]*
/full/path/to/resource/agent"
echo ""
echo "Options:"
echo " -h This text"
echo " -v Be verbose while testing"
echo " -q Be quiet while testing"
echo " -d Turn on RA debugging"
echo " -n name Name of the resource"
echo " -o name=value Name and value of any parameters
required by the agent"
echo " -L Use lrmadmin/lrmd for tests"
exit $1
}
assert() {
rc=$1; shift
target=$1; shift
msg=$1; shift
local targetrc matched
if [ $# = 0 ]; then
exit_code=0
else
exit_code=$1; shift
fi
for targetrc in `echo $target | tr ':' ' '`; do
[ $rc -eq $targetrc ] && matched=1
done
if [ "$matched" != 1 ]; then
num_errors=`expr $num_errors + 1`
echo "* rc=$rc: $msg"
if [ $exit_code != 0 ]; then
[ -n "$command_output" ] && cat<<EOF
$command_output
EOF
echo "Aborting tests"
exit $exit_code
fi
fi
command_output=""
}
done=0
ra_args=""
verbose=0
quiet=0
while test "$done" = "0"; do
case "$1" in
-n) OCF_RESOURCE_INSTANCE=$2; ra_args="$ra_args
OCF_RESOURCE_INSTANCE=$2"; shift; shift;;
-o) name=${2%%=*}; value=${2#*=};
lrm_ra_args="$lrm_ra_args $2";
ra_args="$ra_args OCF_RESKEY_$name='$value'"; shift; shift;;
-L) use_lrmd=1; shift;;
-v) verbose=1; shift;;
-d) export HA_debug=1; shift;;
-q) quiet=1; shift;;
-?|--help) usage 0;;
--version) echo "UNKNOWN"; exit 0;;
-*) echo "unknown option: $1" >&2; usage 1;;
*) done=1;;
esac
done
if [ "x" = "x$OCF_ROOT" ]; then
if [ -d /usr/lib/ocf ]; then
export OCF_ROOT=/usr/lib/ocf
else
echo "You must supply the location of OCF_ROOT (common location is
/usr/lib/ocf)" >&2
usage 1
fi
fi
if [ "x" = "x$OCF_RESOURCE_INSTANCE" ]; then
echo "You must give your resource a name, set OCF_RESOURCE_INSTANCE" >&2
usage 1
fi
agent=$1
if [ ! -e $agent ]; then
echo "You must provide the full path to your resource agent" >&2
usage 1
fi
installed_rc=5
stopped_rc=7
has_demote=1
has_promote=1
start_lrmd() {
lrmd_timeout=0
lrmd_interval=0
lrmd_target_rc=EVERYTIME
lrmd_started=""
$LRMD -s 2>/dev/null
rc=$?
if [ $rc -eq 3 ]; then
lrmd_started=1
$LRMD &
sleep 1
$LRMD -s 2>/dev/null
else
return $rc
fi
}
add_resource() {
$LRMADMIN -A $OCF_RESOURCE_INSTANCE \
ocf \
`basename $agent` \
$(basename `dirname $agent`) \
$lrm_ra_args > /dev/null
}
del_resource() {
$LRMADMIN -D $OCF_RESOURCE_INSTANCE
}
parse_lrmadmin_output() {
awk '
BEGIN{ rc=1; }
/Waiting for lrmd to callback.../ { n=1; next; }
n==1 && /----------------operation--------------/ { n++; next; }
n==2 && /return code:/ { rc=$0; sub("return code: *","",rc); next }
n==2 && /---------------------------------------/ {
n++;
next;
}
END{
if( n!=3 ) exit 1;
else exit rc;
}
'
}
exec_resource() {
op="$1"
args="$2"
$LRMADMIN -E $OCF_RESOURCE_INSTANCE \
$op $lrmd_timeout $lrmd_interval \
$lrmd_target_rc \
$args | parse_lrmadmin_output
}
if [ "$use_lrmd" = 1 ]; then
echo "Using lrmd/lrmadmin for all tests"
start_lrmd || {
echo "could not start lrmd" >&2
exit 1
}
trap '
[ "$lrmd_started" = 1 ] && $LRMD -k
' EXIT
add_resource || {
echo "failed to add resource to lrmd" >&2
exit 1
}
fi
lrm_test_command() {
action="$1"
msg="$2"
debug "$msg"
exec_resource $action "$lrm_ra_args"
}
test_permissions() {
action=meta-data
debug ${1:-"Testing permissions with uid nobody"}
su nobody -s /bin/sh $agent $action > /dev/null
}
test_metadata() {
action=meta-data
msg=${1:-"Testing: $action"}
debug $msg
bash $agent $action | (cd $DATADIR/resource-agents && $METADATA_LINT)
rc=$?
#echo rc: $rc
return $rc
}
test_command() {
action=$1; shift
export __OCF_ACTION=$action
msg=${1:-"Testing: $action"}
if [ "$use_lrmd" = 1 ]; then
lrm_test_command $action "$msg"
return $?
fi
#echo Running: "export $ra_args; bash $agent $action 2>&1 > /dev/null"
if [ $verbose -eq 0 ]; then
command_output=`bash $agent $action 2>&1`
else
debug $msg
bash $agent $action
fi
rc=$?
#echo rc: $rc
return $rc
}
# Begin tests
info "Beginning tests for $agent..."
if [ ! -f $agent ]; then
assert 7 0 "Could not find file: $agent"
fi
if [ `id -u` = 0 ]; then
test_permissions
assert $? 0 "Your agent has too restrictive permissions: should be 755"
else
echo "WARN: Can't check agent's permissions because we're not root;
they should be 755"
fi
test_metadata
assert $? 0 "Your agent produces meta-data which does not conform to
ra-api-1.dtd"
OCF_TESTER_FAIL_HAVE_BINARY=1
export OCF_TESTER_FAIL_HAVE_BINARY
test_command meta-data
rc=$?
if [ $rc -eq 3 ]; then
assert $rc 0 "Your agent does not support the meta-data action"
else
assert $rc 0 "The meta-data action cannot fail and must return 0"
fi
unset OCF_TESTER_FAIL_HAVE_BINARY
ra_args="export $ra_args"
eval $ra_args
test_command validate-all
rc=$?
if [ $rc -eq 3 ]; then
assert $rc 0 "Your agent does not support the validate-all action"
elif [ $rc -ne 0 ]; then
assert $rc 0 "Validation failed. Did you supply enough options with -o ?" 1
usage $rc
fi
test_command monitor "Checking current state"
rc=$?
if [ $rc -eq 3 ]; then
assert $rc 7 "Your agent does not support the monitor action" 1
elif [ $rc -eq 8 ]; then
test_command demote "Cleanup, demote"
assert $? 0 "Your agent was a master and could not be demoted" 1
test_command stop "Cleanup, stop"
assert $? 0 "Your agent was a master and could not be stopped" 1
elif [ $rc -ne 7 ]; then
test_command stop
assert $? 0 "Your agent was active and could not be stopped" 1
fi
test_command monitor
assert $? $stopped_rc "Monitoring a stopped resource should return $stopped_rc"
OCF_TESTER_FAIL_HAVE_BINARY=1
export OCF_TESTER_FAIL_HAVE_BINARY
OCF_RESKEY_CRM_meta_interval=0
test_command monitor
assert $? $stopped_rc:$installed_rc "The initial probe for a stopped resource
should return $stopped_rc or $installed_rc even if all binaries are missing"
unset OCF_TESTER_FAIL_HAVE_BINARY
OCF_RESKEY_CRM_meta_interval=20000
test_command start
assert $? 0 "Start failed. Did you supply enough options with -o ?" 1
test_command monitor
assert $? 0 "Monitoring an active resource should return 0"
OCF_RESKEY_CRM_meta_interval=0
test_command monitor
assert $? 0 "Probing an active resource should return 0"
OCF_RESKEY_CRM_meta_interval=20000
test_command notify
rc=$?
if [ $rc -eq 3 ]; then
info "* Your agent does not support the notify action (optional)"
else
assert $rc 0 "The notify action cannot fail and must return 0"
fi
test_command demote "Checking for demote action"
if [ $? -eq 3 ]; then
has_demote=0
info "* Your agent does not support the demote action (optional)"
fi
test_command promote "Checking for promote action"
if [ $? -eq 3 ]; then
has_promote=0
info "* Your agent does not support the promote action (optional)"
fi
if [ $has_promote -eq 1 -a $has_demote -eq 1 ]; then
test_command demote "Testing: demotion of started resource"
assert $? 0 "Demoting a start resource should not fail"
test_command promote
assert $? 0 "Promote failed"
test_command demote
assert $? 0 "Demote failed" 1
test_command demote "Testing: demotion of demoted resource"
assert $? 0 "Demoting a demoted resource should not fail"
test_command promote "Promoting resource"
assert $? 0 "Promote failed" 1
test_command promote "Testing: promotion of promoted resource"
assert $? 0 "Promoting a promoted resource should not fail"
test_command demote "Demoting resource"
assert $? 0 "Demote failed" 1
elif [ $has_promote -eq 0 -a $has_demote -eq 0 ]; then
info "* Your agent does not support master/slave (optional)"
else
echo "* Your agent partially supports master/slave"
num_errors=`expr $num_errors + 1`
fi
test_command stop
assert $? 0 "Stop failed" 1
test_command monitor
assert $? $stopped_rc "Monitoring a stopped resource should return $stopped_rc"
test_command start "Restarting resource..."
assert $? 0 "Start failed" 1
test_command monitor
assert $? 0 "Monitoring an active resource should return 0"
test_command start "Testing: starting a started resource"
assert $? 0 "Starting a running resource is required to succeed"
test_command monitor
assert $? 0 "Monitoring an active resource should return 0"
test_command stop "Stopping resource"
assert $? 0 "Stop could not clean up after multiple starts" 1
test_command monitor
assert $? $stopped_rc "Monitoring a stopped resource should return $stopped_rc"
test_command stop "Testing: stopping a stopped resource"
assert $? 0 "Stopping a stopped resource is required to succeed"
test_command monitor
assert $? $stopped_rc "Monitoring a stopped resource should return $stopped_rc"
test_command migrate_to "Checking for migrate_to action"
rc=$?
if [ $rc -ne 3 ]; then
test_command migrate_from "Checking for migrate_from action"
fi
if [ $? -eq 3 ]; then
info "* Your agent does not support the migrate action (optional)"
fi
test_command reload "Checking for reload action"
if [ $? -eq 3 ]; then
info "* Your agent does not support the reload action (optional)"
fi
if [ $num_errors -gt 0 ]; then
echo "Tests failed: $agent failed $num_errors tests" >&2
exit 1
else
echo $agent passed all tests
exit 0
fi
# vim:et:ts=8:sw=4
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev