Hi Dejan,
Sorry for the late reply.
I followed your advice and corrected the plug-in; the revised version is attached.
By the way, Holger has made a similar plug-in. How should I proceed?
Best regards,
Yasumasa OZAKI
(2010/09/30 19:38), Yasumasa OZAKI wrote:
Hi Dejan,
Thank you for your reply.
I agree with your comments and will correct the plug-in accordingly.
Please wait a little.
- Whether the domain has stopped will be judged from what virsh returns, and
the CheckIfDead function will be removed.
- The RunCommand function will be refactored and shortened.
- The other points will be corrected as you suggested.
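To illustrate the first point, here is a minimal sketch of judging the result from the exit status alone. It relies on ssh exiting with the status of the remote command, or with 255 when ssh itself fails. The helper name classify_ssh_status is hypothetical and not part of the plug-in.

```shell
#!/bin/sh
# Hypothetical helper (not part of the plug-in): map the exit status of
# "ssh <hypervisor> virsh ..." to a short result keyword.
# ssh returns the remote command's exit status, or 255 on its own errors.
classify_ssh_status() {
    case "$1" in
        0)   echo ok ;;           # virsh succeeded on the hypervisor
        255) echo unreachable ;;  # ssh could not connect or failed itself
        *)   echo failed ;;       # virsh ran but reported an error
    esac
}

# Usage sketch, with $SSH_COMMAND, $hypervisor and $node as in the plug-in:
#   $SSH_COMMAND $hypervisor "virsh destroy $node >/dev/null 2>&1"
#   classify_ssh_status $?
```

One caveat: a remote command that itself exits with 255 cannot be told apart from a connection failure this way.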
Thanks,
Yasumasa OZAKI
On Thu, 23 Sep 2010 17:08:12 +0200, Dejan Muhamedagic wrote:
Hi Yasumasa-san,
On Fri, Sep 10, 2010 at 03:17:56PM +0900, Yasumasa OZAKI wrote:
Hi,
I made the STONITH plug-in that can be used by both Xen and KVM.
Thanks! Comments below.
I would like to hear any opinion.
Best regards,
Yasumasa OZAKI
#!/bin/sh
#
# External STONITH module for Xen/KVM hypervisor through ssh.
# Uses Xen/KVM hypervisor as a STONITH device to control guest.
#
# Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
#
STOP_COMMAND="virsh destroy"
START_COMMAND="virsh start"
DUMP_COMMAND="virsh dump"
SSH_COMMAND="/usr/bin/ssh -q -x -n"
# Rewrite the hostlist to accept "," as a delimiter for hostnames too.
hostlist=`echo $hostlist | tr ',' ' '`
CheckIfDead() {
for j in 1 2 3 4 5
do
if ! ping -w1 -c1 "$1">/dev/null 2>&1
then
return 0
fi
sleep 1
done
return 1
}
I'd rather have this one removed. If virsh returns a meaningful
exit code, then we can rely on that, right? Or use domstate?
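A minimal sketch of the domstate idea, assuming that "virsh domstate" prints "shut off" for a stopped domain when run in the C locale. The function name IsDomainOff and the VIRSH_CMD override are hypothetical; in the plug-in the command would be run on the hypervisor through ssh.

```shell
#!/bin/sh
# Sketch only: VIRSH_CMD is a hypothetical override so the function can be
# exercised without libvirt installed.
VIRSH_CMD=${VIRSH_CMD:-virsh}

# IsDomainOff <domain>
# Returns 0 if the domain is reported as "shut off", 1 otherwise
# (including when virsh itself fails, e.g. for an unknown domain).
IsDomainOff() {
    # LC_ALL=C keeps the state string untranslated.
    state=$(LC_ALL=C $VIRSH_CMD domstate "$1" 2>/dev/null) || return 1
    [ "$state" = "shut off" ]
}

# Usage sketch:
#   if IsDomainOff guest1; then echo "guest1 is down"; fi
```

Unlike the ping loop, this asks the hypervisor directly, so it does not depend on the guest's network being reachable.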
CheckHostList() {
if [ "x" = "x$hostlist" ]
then
ha_log.sh err "hostlist isn't set"
exit 1
fi
}
CheckHypervisor() {
if [ "x" = "x$hypervisor" ]
then
ha_log.sh err "hypervisor isn't set"
exit 1
fi
}
RunCommand() {
CheckHostList
CheckHypervisor
for node in $hostlist
do
if [ "$node" != "$1" ]
then
continue
fi
You can abbreviate code like this into a more compact
[ "$node" != "$1" ]&& continue
case $2 in
stop)
if [ "x$run_dump" != "x" ]
then
#Need to run core dump
if [ "x$dump_dir" != "x" ]
then
TIMESTAMP=`date +%Y-%m%d-%H%M.%S`
DOMAINNAME=`printf "%s" $node`
COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core
#Run core dump
command_result="$($SSH_COMMAND $hypervisor " \
pgrep -f ^virsh_$node>/dev/null 2>&1; \
if [ \$? = 0 ]; then echo RUNNING; exit; fi; \
mkdir -p $dump_dir; \
if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \
(exec -a virsh_$node $DUMP_COMMAND $node $COREFILE>/dev/null 2>&1); \
if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \
echo OK")"
ssh_result=$?
if [ $ssh_result = 0 ]
then
case "$command_result" in
RUNNING)
ha_log.sh info "Dump is already running"
exit 0
;;
SUSPEND_FAILED)
ha_log.sh err "Failed to suspend domain $node"
;;
MKDIR_FAILED)
ha_log.sh err "Failed to create directory $dump_dir"
;;
DUMP_FAILED)
ha_log.sh err "Failed to core dump domain $node to $COREFILE"
;;
OK)
ha_log.sh notice "Domain $node dumped to $COREFILE"
;;
esac
else
ha_log.sh err "Couldn't connect to hypervisor $hypervisor"
fi
else
ha_log.sh err "dump_dir isn't set"
fi
fi
command_result=$($SSH_COMMAND $hypervisor "((sleep 2; $STOP_COMMAND $node)>/dev/null 2>&1&); echo \$?")
ssh_result=$?
if [ $ssh_result = 0 ]
then
if [ $command_result = 0 ]
then
ha_log.sh notice "Domain $node is stopped"
else
ha_log.sh err "Failed to stop domain $node"
fi
else
ha_log.sh err "Couldn't connect to hypervisor $hypervisor"
fi
break;;
start)
command_result=$($SSH_COMMAND $hypervisor "((sleep 2; $START_COMMAND $node)>/dev/null 2>&1&); echo \$?")
ssh_result=$?
if [ $ssh_result = 0 ]
then
if [ $command_result = 0 ]
then
ha_log.sh notice "Domain $node is started"
else
ha_log.sh err "Failed to start domain $node"
fi
else
ha_log.sh err "Couldn't connect to hypervisor $hypervisor"
fi
break;;
esac
exit 0
done
}
This function is too long. Can you try to refactor it and split it
into several smaller ones? Also, the stop and start branches are good
candidates to be folded into one function.
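One way the start and stop branches might be folded together is sketched below. The function name ChangeNodeState and the LOG override are hypothetical. SSH_COMMAND and the virsh commands match the plug-in. Both commands run in the foreground here, so the exit status reflects the result reported by virsh.

```shell
#!/bin/sh
# Sketch: one function handling both "start" and "stop".
# LOG is a hypothetical override so the sketch can run without ha_log.sh.
SSH_COMMAND=${SSH_COMMAND:-"/usr/bin/ssh -q -x -n"}
LOG=${LOG:-ha_log.sh}

# ChangeNodeState <start|stop> <node> <hypervisor>
ChangeNodeState() {
    case "$1" in
        start) virsh_cmd="virsh start";   done_msg="started" ;;
        stop)  virsh_cmd="virsh destroy"; done_msg="stopped" ;;
        *)     $LOG err "Unknown action $1"; return 1 ;;
    esac
    # Capture the ssh exit status without tripping "set -e".
    rc=0
    $SSH_COMMAND "$3" "$virsh_cmd $2 >/dev/null 2>&1" || rc=$?
    case $rc in
        0)   $LOG notice "Domain $2 is $done_msg" ;;
        255) $LOG err "Couldn't connect to hypervisor $3"; return 1 ;;
        *)   $LOG err "Failed to $1 domain $2"; return 1 ;;
    esac
}

# Usage sketch:
#   ChangeNodeState stop "$node" "$hypervisor" || exit 1
```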
# Main code
case $1 in
gethosts)
CheckHostList
for node in $hostlist ; do
echo $node
done
exit 0
;;
on)
RunCommand $2 start
exit $?
;;
off)
if RunCommand $2 stop
then
if CheckIfDead $2
then
exit 0
fi
fi
exit 1
;;
reset)
RunCommand $2 stop
if CheckIfDead $2
then
RunCommand $2 start
exit 0
fi
exit 1
;;
status)
exit 0
;;
getconfignames)
echo "hostlist hypervisor"
exit 0
;;
getinfo-devid)
echo "virsh STONITH device"
exit 0
;;
getinfo-devname)
echo "virsh STONITH external device"
exit 0
;;
getinfo-devdescr)
echo "ssh-based Linux host reset for Xen/KVM guest domain through hypervisor"
echo "Fine for testing, but not really suitable for production!"
Well, it could be used in production too, if you replace the
ping part.
exit 0
;;
getinfo-devurl)
echo "http://openssh.org http://www.xensource.com/ http://linux-ha.org/wiki"
I think that this should be reduced to just xensource.
exit 0
;;
getinfo-xml)
cat<< SSHXML
<parameters>
<parameter name="hostlist" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hostlist
</shortdesc>
<longdesc lang="en">
The list of controlled nodes.
For example: "node1 node2"
</longdesc>
</parameter>
<parameter name="hypervisor" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hypervisor hostname
</shortdesc>
<longdesc lang="en">
Host name of the hypervisor. The root user must be able to ssh to that node.
Using public key authentication. This may also be a security
issue, but let's leave that to users to decide.
</longdesc>
</parameter>
<parameter name="run_dump" unique="0" required="0">
<content type="string" />
<shortdesc lang="en">
Run core dump
</shortdesc>
<longdesc lang="en">
If set plugin will call "virsh dump" before killing guest domain
</longdesc>
</parameter>
<parameter name="dump_dir" unique="1" required="0">
<content type="string" />
<shortdesc lang="en">
Run dump core with the specified directory
When the "run_dump" parameter is set, this parameter is indispensable
</shortdesc>
<longdesc lang="en">
This parameter can indicate the dump destination.
Should be set as a full path format, ex.) "/var/log/dump"
The above example would dump the core, like;
/var/log/dump/2009-0316-1403.37-GuestDomain.core
</longdesc>
</parameter>
</parameters>
SSHXML
exit 0
;;
*)
exit 1
;;
esac
Cheers,
Dejan
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
#!/bin/sh
#
# External STONITH module for Xen/KVM hypervisor through ssh.
# Uses Xen/KVM hypervisor as a STONITH device to control guest.
#
# Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
#
STOP_COMMAND="virsh destroy"
START_COMMAND="virsh start"
DUMP_COMMAND="virsh dump"
SSH_COMMAND="/usr/bin/ssh -q -x -n"
# Rewrite the hostlist to accept "," as a delimiter for hostnames too.
hostlist=`echo $hostlist | tr ',' ' '`
CheckHostList() {
if [ "x" = "x$hostlist" ]
then
ha_log.sh err "hostlist isn't set"
exit 1
fi
}
CheckHypervisor() {
if [ "x" = "x$hypervisor" ]
then
ha_log.sh err "hypervisor isn't set"
exit 1
fi
}
DumpNode() {
if [ "x$dump_dir" = "x" ]
then
ha_log.sh err "dump_dir isn't set"
return 1
fi
TIMESTAMP=`date +%Y-%m%d-%H%M.%S`
DOMAINNAME=`printf "%s" $1`
COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core
command_result="$($SSH_COMMAND $2 " \
pgrep -f ^virsh_$1 >/dev/null 2>&1; \
if [ \$? = 0 ]; then echo RUNNING; exit; fi; \
mkdir -p $dump_dir >/dev/null 2>&1; \
if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \
(exec -a virsh_$1 $DUMP_COMMAND $1 $COREFILE >/dev/null 2>&1); \
if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \
echo OK \
")"
ssh_result=$?
ha_log.sh debug "\$ssh_result=$ssh_result"
case $ssh_result in
0)
case $command_result in
RUNNING)
ha_log.sh info "Dump is already running"
;;
MKDIR_FAILED)
ha_log.sh err "Failed to create directory $dump_dir"
return 1
;;
DUMP_FAILED)
ha_log.sh err "Failed to core dump domain $1 to $COREFILE"
return 1
;;
OK)
ha_log.sh notice "Domain $1 dumped to $COREFILE"
;;
esac
;;
255)
ha_log.sh err "Couldn't connect to hypervisor $2"
return 1
;;
*)
ha_log.sh err "Failed to core dump domain $1 to $COREFILE"
return 1
;;
esac
}
StopNode() {
$SSH_COMMAND $2 "$STOP_COMMAND $1 >/dev/null 2>&1"
ssh_result=$?
ha_log.sh debug "\$ssh_result=$ssh_result"
case $ssh_result in
0)
ha_log.sh notice "Domain $1 is stopped"
;;
255)
ha_log.sh err "Couldn't connect to hypervisor $2"
return 1
;;
*)
ha_log.sh err "Failed to stop domain $1"
return 1
;;
esac
}
StartNode() {
$SSH_COMMAND $2 "$START_COMMAND $1 >/dev/null 2>&1 &"
ssh_result=$?
ha_log.sh debug "\$ssh_result=$ssh_result"
case $ssh_result in
0)
ha_log.sh notice "Domain $1 is started"
;;
255)
ha_log.sh err "Couldn't connect to hypervisor $2"
return 1
;;
*)
ha_log.sh err "Failed to start domain $1"
return 1
;;
esac
}
RunCommand() {
CheckHostList
CheckHypervisor
for node in $hostlist
do
[ "$node" != "$1" ] && continue
ha_log.sh debug "Target domain is $node"
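case $2 in
stop)
[ "x$run_dump" != "x" ] && DumpNode $node $hypervisor
StopNode $node $hypervisor || return 1
return 0
;;
start)
StartNode $node $hypervisor || return 1
return 0
;;
esac
done
# No entry in hostlist matched the requested node, so report failure
# instead of silently returning success.
ha_log.sh err "Domain $1 is not in hostlist"
return 1
}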
# Main code
case $1 in
gethosts)
CheckHostList
for node in $hostlist ; do
echo $node
done
exit 0
;;
on)
RunCommand $2 start
exit $?
;;
off)
RunCommand $2 stop
exit $?
;;
reset)
RunCommand $2 stop && RunCommand $2 start
exit $?
;;
status)
exit 0
;;
getconfignames)
echo "hostlist hypervisor"
exit 0
;;
getinfo-devid)
echo "virsh STONITH device"
exit 0
;;
getinfo-devname)
echo "virsh STONITH external device"
exit 0
;;
getinfo-devdescr)
echo "ssh-based Linux host reset for Xen/KVM guest domain through hypervisor"
echo "Fine for testing, but not really suitable for production!"
exit 0
;;
getinfo-devurl)
echo "http://openssh.org http://www.xensource.com/ http://linux-ha.org/wiki"
exit 0
;;
getinfo-xml)
cat << SSHXML
<parameters>
<parameter name="hostlist" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hostlist
</shortdesc>
<longdesc lang="en">
The list of controlled nodes.
For example: "node1 node2"
</longdesc>
</parameter>
<parameter name="hypervisor" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hypervisor hostname
</shortdesc>
<longdesc lang="en">
Host name of the hypervisor. The root user must be able to ssh to that node.
</longdesc>
</parameter>
<parameter name="run_dump" unique="0" required="0">
<content type="string" />
<shortdesc lang="en">
Run core dump
</shortdesc>
<longdesc lang="en">
If set plugin will call "virsh dump" before killing guest domain
</longdesc>
</parameter>
<parameter name="dump_dir" unique="1" required="0">
<content type="string" />
<shortdesc lang="en">
Run dump core with the specified directory
When the "run_dump" parameter is set, this parameter is indispensable
</shortdesc>
<longdesc lang="en">
This parameter can indicate the dump destination.
Should be set as a full path format, ex.) "/var/log/dump"
The above example would dump the core, like;
/var/log/dump/2009-0316-1403.37-GuestDomain.core
</longdesc>
</parameter>
</parameters>
SSHXML
exit 0
;;
*)
exit 1
;;
esac