Hi Yasumasa-san, On Fri, Sep 10, 2010 at 03:17:56PM +0900, Yasumasa OZAKI wrote: > Hi, > > I made the STONITH plug-in that can be used by both Xen and KVM.
Thanks! Comments below. > I would like to hear any opinion. > > Best regards, > Yasumasa OZAKI > > > > #!/bin/sh > # > # External STONITH module for Xen/KVM hypervisor through ssh. > # Uses Xen/KVM hypervisor as a STONITH device to control guest. > # > # Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION > # > > STOP_COMMAND="virsh destroy" > START_COMMAND="virsh start" > DUMP_COMMAND="virsh dump" > SSH_COMMAND="/usr/bin/ssh -q -x -n" > > # Rewrite the hostlist to accept "," as a delimeter for hostnames too. > hostlist=`echo $hostlist | tr ',' ' '` > > CheckIfDead() { > for j in 1 2 3 4 5 > do > if ! ping -w1 -c1 "$1" >/dev/null 2>&1 > then > return 0 > fi > sleep 1 > done > > return 1 > } I'd rather have this one removed. If virsh returns a meaningful exit code, then we can rely on that, right? Or use domstate? > CheckHostList() { > if [ "x" = "x$hostlist" ] > then > ha_log.sh err "hostlist isn't set" > exit 1 > fi > } > > CheckHypervisor() { > if [ "x" = "x$hypervisor" ] > then > ha_log.sh err "hypervisor isn't set" > exit 1 > fi > } > > RunCommand() { > CheckHostList > CheckHypervisor > > for node in $hostlist > do > if [ "$node" != "$1" ] > then > continue > fi You can abbreviate code like this into a more compact [ "$node" != "$1" ] && continue > case $2 in > stop) > if [ "x$run_dump" != "x" ] > then > #Need to run core dump > if [ "x$dump_dir" != "x" ] > then > TIMESTAMP=`date +%Y-%m%d-%H%M.%S` > DOMAINNAME=`printf "%s" $node` > COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core > #Run core dump > command_result="$($SSH_COMMAND $hypervisor " \ > pgrep -f ^virsh_$node >/dev/null 2>&1; \ > if [ \$? = 0 ]; then echo RUNNING; exit; fi; \ > mkdir -p $dump_dir; \ > if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \ > (exec -a virsh_$node $DUMP_COMMAND $node $COREFILE >/dev/null 2>&1); \ > if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \ > echo OK")" > ssh_result=$? > if [ $ssh_result = 0 ] > then > case "$command_result" in > RUNNING) > ha_log.sh info "Dump is already running" > exit 0 > ;; > SUSPEND_FAILED) > ha_log.sh err "Failed to suspend domain > $node" > ;; > MKDIR_FAILED) > ha_log.sh err "Failed to create directory > $dump_dir" > ;; > DUMP_FAILED) > ha_log.sh err "Failed to core dump domain > $node to $COREFILE" > ;; > OK) > ha_log.sh notice "Domain $node dumped to > $COREFILE" > ;; > esac > else > ha_log.sh err "Couldn't connect to hypervisor > $hypervisor" > fi > else > ha_log.sh err "dump_dir isn't set" > fi > fi > > command_result=$($SSH_COMMAND $hypervisor "((sleep 2; > $STOP_COMMAND $node) >/dev/null 2>&1 &); echo \$?") > ssh_result=$? > if [ $ssh_result = 0 ] > then > if [ $command_result = 0 ] > then > ha_log.sh notice "Domain $node is stoped" > else > ha_log.sh err "Failed to stop domain $node" > fi > else > ha_log.sh err "Couldn't connect to hypervisor $hypervisor" > fi > break;; > start) > command_result=$($SSH_COMMAND $hypervisor "((sleep 2; > $START_COMMAND $node) >/dev/null 2>&1 &); echo \$?") > ssh_result=$? > if [ $ssh_result = 0 ] > then > if [ $command_result = 0 ] > then > ha_log.sh notice "Domain $node is started" > else > ha_log.sh err "Failed to start domain $node" > fi > else > ha_log.sh err "Couldn't connect to hypervisor $hypervisor" > fi > break;; > esac > exit 0 > done > } This function is too long. Can you try to refactor and split it into several. Also, the stop and start good candidates to be folded into one function. > # Main code > > case $1 in > gethosts) > CheckHostList > > for node in $hostlist ; do > echo $node > done > exit 0 > ;; > on) > RunCommand $2 start > exit $? > ;; > off) > if RunCommand $2 stop > then > if CheckIfDead $2 > then > exit 0 > fi > fi > > exit 1 > ;; > reset) > RunCommand $2 stop > > if CheckIfDead $2 > then > RunCommand $2 start > exit 0 > fi > > exit 1 > ;; > status) > exit 0 > ;; > getconfignames) > echo "hostlist hypervisor" > exit 0 > ;; > getinfo-devid) > echo "virsh STONITH device" > exit 0 > ;; > getinfo-devname) > echo "virsh STONITH external device" > exit 0 > ;; > getinfo-devdescr) > echo "ssh-based Linux host reset for Xen/KVM guest domain trough > hypervisor" > echo "Fine for testing, but not really suitable for production!" Well, it could be used in production too, if you replace the ping part. > exit 0 > ;; > getinfo-devurl) > echo "http://openssh.org http://www.xensource.com/ > http://linux-ha.org/wiki" I think that this should be reduced to just xensource. > exit 0 > ;; > getinfo-xml) > cat << SSHXML > <parameters> > <parameter name="hostlist" unique="1" required="1"> > <content type="string" /> > <shortdesc lang="en"> > Hostlist > </shortdesc> > <longdesc lang="en"> > The list of controlled nodes. > For example: "node1 node2" > </longdesc> > </parameter> > <parameter name="hypervisor" unique="1" required="1"> > <content type="string" /> > <shortdesc lang="en"> > Hypervisor hostname > </shortdesc> > <longdesc lang="en"> > Host name to execute hypervisor. Root user shall be able to ssh to that node. Using public key authentication. This may also be a security issue, but let's leave that to users to decide. > </longdesc> > </parameter> > <parameter name="run_dump" unique="0" required="0"> > <content type="string" /> > <shortdesc lang="en"> > Run core dump > </shortdesc> > <longdesc lang="en"> > If set plugin will call "virsh dump" before killing guest domain > </longdesc> > </parameter> > <parameter name="dump_dir" unique="1" required="0"> > <content type="string" /> > <shortdesc lang="en"> > Run dump core with the specified directory > When the "run_dump" parameter is set, this parameter is indispensable > </shortdesc> > <longdesc lang="en"> > This parameter can indicate the dump destination. > Should be set as a full path format, ex.) "/var/log/dump" > The above example would dump the core, like; > /var/log/dump/2009-0316-1403.37-GuestDomain.core > </longdesc> > </parameter> > </parameters> > SSHXML > exit 0 > ;; > *) > exit 1 > ;; > esac Cheers, Dejan _______________________________________________________ Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev Home Page: http://linux-ha.org/