Hi Dejan,

Sorry for the late reply.

I have followed your advice and corrected the plug-in, so I am sending the revised version.

By the way, Holger made a similar plug-in. How should I proceed?

Best regards,
Yasumasa OZAKI

(2010/09/30 19:38), Yasumasa OZAKI wrote:
Hi Dejan,

Thank you for your reply.

I will correct the plug-in based on your comments, because I think they
are right. Please wait a little.

- Whether the domain has stopped will be judged from the virsh exit code,
and the CheckIfDead function will be removed.

- The RunCommand function will be refactored and shortened.

- The other points will be corrected based on your comments.

Thanks,
Yasumasa OZAKI

On Thu, 23 Sep 2010 17:08:12 +0200, Dejan Muhamedagic wrote:
Hi Yasumasa-san,

On Fri, Sep 10, 2010 at 03:17:56PM +0900, Yasumasa OZAKI wrote:
Hi,

I made the STONITH plug-in that can be used by both Xen and KVM.

Thanks! Comments below.

I would like to hear any opinion.

Best regards,
Yasumasa OZAKI




#!/bin/sh
#
# External STONITH module for Xen/KVM hypervisor through ssh.
# Uses Xen/KVM hypervisor as a STONITH device to control guest.
#
# Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
#

STOP_COMMAND="virsh destroy"
START_COMMAND="virsh start"
DUMP_COMMAND="virsh dump"
SSH_COMMAND="/usr/bin/ssh -q -x -n"

# Rewrite the hostlist to accept "," as a delimiter for hostnames too.
hostlist=`echo $hostlist | tr ',' ' '`

CheckIfDead() {
     for j in 1 2 3 4 5
     do
         if ! ping -w1 -c1 "$1">/dev/null 2>&1
         then
             return 0
         fi
         sleep 1
     done

     return 1
}

I'd rather have this one removed. If virsh returns a meaningful
exit code, then we can rely on that, right? Or use domstate?
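A minimal sketch of the domstate idea, assuming that "virsh domstate" prints "shut off" for a stopped domain that is still defined on the hypervisor. HypervisorRun and DomainIsOff are hypothetical names that are not part of the plug-in. If the domain is transient and disappears after "virsh destroy", domstate fails and this sketch reports "not confirmed off", so that case would need separate handling.

```shell
#!/bin/sh
# Sketch only: HypervisorRun and DomainIsOff are hypothetical helpers.

HypervisorRun() {
    # Run the given command on the hypervisor named in $hypervisor.
    /usr/bin/ssh -q -x -n "$hypervisor" "$@"
}

DomainIsOff() {
    # $1 = domain name.
    # Succeeds only if virsh could be queried and reports "shut off".
    state=$(HypervisorRun virsh domstate "$1" 2>/dev/null) || return 1
    [ "$state" = "shut off" ]
}
```

With a check like this, the "off" action could call DomainIsOff after "virsh destroy" instead of pinging the guest.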

CheckHostList() {
     if [ "x" = "x$hostlist" ]
     then
         ha_log.sh err "hostlist isn't set"
         exit 1
     fi
}

CheckHypervisor() {
     if [ "x" = "x$hypervisor" ]
     then
         ha_log.sh err "hypervisor isn't set"
         exit 1
     fi
}

RunCommand() {
     CheckHostList
     CheckHypervisor

     for node in $hostlist
     do
         if [ "$node" != "$1" ]
         then
             continue
         fi

You can abbreviate code like this into a more compact

        [ "$node" != "$1" ] && continue
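The compact form can be tried in isolation with a small loop. FindNode below is a hypothetical helper written only for this illustration. It skips every entry that does not match and stops at the first one that does.

```shell
#!/bin/sh
# Sketch only: FindNode is a hypothetical helper, not part of the plug-in.

FindNode() {
    # $1 = wanted node, remaining arguments = candidate nodes.
    wanted=$1
    shift
    for node in "$@"
    do
        # Same effect as the if/then/continue/fi block, in one line.
        [ "$node" != "$wanted" ] && continue
        echo "$node"
        return 0
    done
    # No candidate matched.
    return 1
}
```

One thing to keep in mind: when the test is false, the whole "[ ... ] && command" list has a non-zero status. That is harmless inside a loop, but it matters if such a line is the last command of a function.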

         case $2 in
             stop)
                 if [ "x$run_dump" != "x" ]
                 then
                     #Need to run core dump
                     if [ "x$dump_dir" != "x" ]
                     then
                         TIMESTAMP=`date +%Y-%m%d-%H%M.%S`
                         DOMAINNAME=`printf "%s" $node`
                         COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core
                         #Run core dump
                         command_result="$($SSH_COMMAND $hypervisor " \
pgrep -f ^virsh_$node>/dev/null 2>&1; \
if [ \$? = 0 ]; then echo RUNNING; exit; fi; \
mkdir -p $dump_dir; \
if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \
(exec -a virsh_$node $DUMP_COMMAND $node $COREFILE>/dev/null 2>&1); \
if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \
echo OK")"
                         ssh_result=$?
                         if [ $ssh_result = 0 ]
                         then
                             case "$command_result" in
                                 RUNNING)
                                     ha_log.sh info "Dump is already running"
                                     exit 0
                                     ;;
                                 SUSPEND_FAILED)
                                     ha_log.sh err "Failed to suspend domain $node"
                                     ;;
                                 MKDIR_FAILED)
                                     ha_log.sh err "Failed to create directory $dump_dir"
                                     ;;
                                 DUMP_FAILED)
                                     ha_log.sh err "Failed to core dump domain $node to $COREFILE"
                                     ;;
                                 OK)
                                     ha_log.sh notice "Domain $node dumped to $COREFILE"
                                     ;;
                             esac
                         else
                             ha_log.sh err "Couldn't connect to hypervisor $hypervisor"
                         fi
                     else
                         ha_log.sh err "dump_dir isn't set"
                     fi
                 fi

                 command_result=$($SSH_COMMAND $hypervisor "((sleep 2; $STOP_COMMAND $node)>/dev/null 2>&1&); echo \$?")
                 ssh_result=$?
                 if [ $ssh_result = 0 ]
                 then
                     if [ $command_result = 0 ]
                     then
                         ha_log.sh notice "Domain $node is stopped"
                     else
                         ha_log.sh err "Failed to stop domain $node"
                     fi
                 else
                     ha_log.sh err "Couldn't connect to hypervisor $hypervisor"
                 fi
                 break;;
             start)
                 command_result=$($SSH_COMMAND $hypervisor "((sleep 2; $START_COMMAND $node)>/dev/null 2>&1&); echo \$?")
                 ssh_result=$?
                 if [ $ssh_result = 0 ]
                 then
                     if [ $command_result = 0 ]
                     then
                         ha_log.sh notice "Domain $node is started"
                     else
                         ha_log.sh err "Failed to start domain $node"
                     fi
                 else
                     ha_log.sh err "Couldn't connect to hypervisor $hypervisor"
                 fi
                 break;;
         esac
         exit 0
     done
}

This function is too long. Can you try to refactor it and split it
into several functions? Also, the stop and start branches are good
candidates to be folded into one function.
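One possible shape for the folded function, as a sketch under stated assumptions rather than the plug-in's actual code. DomainAction, HypervisorRun and Log are hypothetical names. The sketch relies on ssh exiting with 255 when the connection itself fails and otherwise passing through the exit status of the remote command.

```shell
#!/bin/sh
# Sketch only: DomainAction, HypervisorRun and Log are hypothetical names.

Log() {
    # Thin wrapper so the logger can be swapped out when testing.
    ha_log.sh "$@"
}

HypervisorRun() {
    # $1 = hypervisor host, remaining arguments = command to run there.
    host=$1
    shift
    /usr/bin/ssh -q -x -n "$host" "$@"
}

DomainAction() {
    # $1 = domain, $2 = hypervisor, $3 = stop|start
    case $3 in
        stop)  virsh_cmd="virsh destroy"; done_msg="stopped" ;;
        start) virsh_cmd="virsh start";   done_msg="started" ;;
        *)     return 1 ;;
    esac

    HypervisorRun "$2" "$virsh_cmd $1 >/dev/null 2>&1"
    rc=$?
    case $rc in
        0)
            Log notice "Domain $1 is $done_msg"
            ;;
        255)
            # ssh itself failed, so virsh never ran.
            Log err "Couldn't connect to hypervisor $2"
            return 1
            ;;
        *)
            # ssh connected, but virsh reported an error.
            Log err "Failed to $3 domain $1"
            return 1
            ;;
    esac
}
```

The only per-action differences are the virsh subcommand and the wording of the log message, which is why a single case statement at the top is enough to cover both stop and start.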

# Main code

case $1 in
gethosts)
     CheckHostList

     for node in $hostlist ; do
         echo $node
     done
     exit 0
     ;;
on)
     RunCommand $2 start
     exit $?
     ;;
off)
     if RunCommand $2 stop
     then
         if CheckIfDead $2
         then
             exit 0
         fi
     fi

     exit 1
     ;;
reset)
     RunCommand $2 stop

     if CheckIfDead $2
     then
         RunCommand $2 start
         exit 0
     fi

     exit 1
     ;;
status)
     exit 0
     ;;
getconfignames)
     echo "hostlist hypervisor"
     exit 0
     ;;
getinfo-devid)
     echo "virsh STONITH device"
     exit 0
     ;;
getinfo-devname)
     echo "virsh STONITH external device"
     exit 0
     ;;
getinfo-devdescr)
     echo "ssh-based Linux host reset for Xen/KVM guest domain through hypervisor"
     echo "Fine for testing, but not really suitable for production!"

Well, it could be used in production too, if you replace the
ping part.

     exit 0
     ;;
getinfo-devurl)
     echo "http://openssh.org http://www.xensource.com/ http://linux-ha.org/wiki";

I think that this should be reduced to just xensource.

     exit 0
     ;;
getinfo-xml)
     cat << SSHXML
<parameters>
<parameter name="hostlist" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hostlist
</shortdesc>
<longdesc lang="en">
The list of controlled nodes.
For example: "node1 node2"
</longdesc>
</parameter>
<parameter name="hypervisor" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hypervisor hostname
</shortdesc>
<longdesc lang="en">
Host name to execute hypervisor. Root user shall be able to ssh to that node.

Using public key authentication. This may also be a security
issue, but let's leave that to users to decide.

</longdesc>
</parameter>
<parameter name="run_dump" unique="0" required="0">
<content type="string" />
<shortdesc lang="en">
Run core dump
</shortdesc>
<longdesc lang="en">
If set plugin will call "virsh dump" before killing guest domain
</longdesc>
</parameter>
<parameter name="dump_dir" unique="1" required="0">
<content type="string" />
<shortdesc lang="en">
Run dump core with the specified directory
When the "run_dump" parameter is set, this parameter is indispensable
</shortdesc>
<longdesc lang="en">
This parameter can indicate the dump destination.
Should be set as a full path format, ex.) "/var/log/dump"
The above example would dump the core, like;
/var/log/dump/2009-0316-1403.37-GuestDomain.core
</longdesc>
</parameter>
</parameters>
SSHXML
     exit 0
     ;;
*)
     exit 1
     ;;
esac

Cheers,

Dejan

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


#!/bin/sh
#
# External STONITH module for Xen/KVM hypervisor through ssh.
# Uses Xen/KVM hypervisor as a STONITH device to control guest.
#
# Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
#

STOP_COMMAND="virsh destroy"
START_COMMAND="virsh start"
DUMP_COMMAND="virsh dump"
SSH_COMMAND="/usr/bin/ssh -q -x -n"

# Rewrite the hostlist to accept "," as a delimiter for hostnames too.
hostlist=`echo $hostlist | tr ',' ' '`

CheckHostList() {
    if [ "x" = "x$hostlist" ]
    then
        ha_log.sh err "hostlist isn't set"
        exit 1
    fi
}

CheckHypervisor() {
    if [ "x" = "x$hypervisor" ]
    then
        ha_log.sh err "hypervisor isn't set"
        exit 1
    fi
}

DumpNode() {
    if [ "x$dump_dir" = "x" ]
    then
        ha_log.sh err "dump_dir isn't set"
        return 1
    fi

    TIMESTAMP=`date +%Y-%m%d-%H%M.%S`
    DOMAINNAME=`printf "%s" $1`
    COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core

    command_result="$($SSH_COMMAND $2 " \
pgrep -f ^virsh_$1 >/dev/null 2>&1; \
if [ \$? = 0 ]; then echo RUNNING; exit; fi; \
mkdir -p $dump_dir >/dev/null 2>&1; \
if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \
(exec -a virsh_$1 $DUMP_COMMAND $1 $COREFILE >/dev/null 2>&1); \
if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \
echo OK \
")"
    ssh_result=$?
    ha_log.sh debug "\$ssh_result=$ssh_result"
    case $ssh_result in
        0)
            case $command_result in
                RUNNING)
                    ha_log.sh info "Dump is already running"
                    ;;
                MKDIR_FAILED)
                    ha_log.sh err "Failed to create directory $dump_dir"
                    return 1
                    ;;
                DUMP_FAILED)
                    ha_log.sh err "Failed to core dump domain $1 to $COREFILE"
                    return 1
                    ;;
                OK)
                    ha_log.sh notice "Domain $1 dumped to $COREFILE"
                    ;;
            esac
            ;;
        255)
            ha_log.sh err "Couldn't connect to hypervisor $2"
            return 1
            ;;
        *)
            ha_log.sh err "Failed to core dump domain $1 to $COREFILE"
            return 1
            ;;
    esac
}

StopNode() {
    $SSH_COMMAND $2 "$STOP_COMMAND $1 >/dev/null 2>&1"
    ssh_result=$?
    ha_log.sh debug "\$ssh_result=$ssh_result"
    case $ssh_result in
        0)
            ha_log.sh notice "Domain $1 is stopped"
            ;;
        255)
            ha_log.sh err "Couldn't connect to hypervisor $2"
            return 1
            ;;
        *)
            ha_log.sh err "Failed to stop domain $1"
            return 1
            ;;
    esac
}

StartNode() {
    $SSH_COMMAND $2 "$START_COMMAND $1 >/dev/null 2>&1 &"
    ssh_result=$?
    ha_log.sh debug "\$ssh_result=$ssh_result"
    case $ssh_result in
        0)
            ha_log.sh notice "Domain $1 is started"
            ;;
        255)
            ha_log.sh err "Couldn't connect to hypervisor $2"
            return 1
            ;;
        *)
            ha_log.sh err "Failed to start domain $1"
            return 1
            ;;
    esac
}

RunCommand() {
    CheckHostList
    CheckHypervisor
    
    for node in $hostlist
    do
        [ "$node" != "$1" ] && continue
        ha_log.sh debug "Target domain is $node"
          
        case $2 in
            stop)
                [ "x$run_dump" != "x" ] && DumpNode $node $hypervisor
                StopNode $node $hypervisor || return 1
                break
                ;;
            start)
                StartNode $node $hypervisor || return 1
                break
                ;;
        esac
    done
}


# Main code

case $1 in
gethosts)
    CheckHostList
    
    for node in $hostlist ; do
        echo $node
    done
    exit 0
    ;;
on)
    RunCommand $2 start
    exit $?
    ;;
off)
    RunCommand $2 stop
    exit $?
    ;;
reset)
    RunCommand $2 stop && RunCommand $2 start
    exit $?
    ;;
status)
    exit 0
    ;;
getconfignames)
    echo "hostlist hypervisor"
    exit 0
    ;;
getinfo-devid)
    echo "virsh STONITH device"
    exit 0
    ;;
getinfo-devname)
    echo "virsh STONITH external device"
    exit 0
    ;;
getinfo-devdescr)
    echo "ssh-based Linux host reset for Xen/KVM guest domain through hypervisor"
    echo "Fine for testing, but not really suitable for production!"
    exit 0
    ;;
getinfo-devurl)
    echo "http://openssh.org http://www.xensource.com/ http://linux-ha.org/wiki";
    exit 0
    ;;
getinfo-xml)
    cat << SSHXML
<parameters>
<parameter name="hostlist" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hostlist
</shortdesc>
<longdesc lang="en">
The list of controlled nodes.
For example: "node1 node2"
</longdesc>
</parameter>
<parameter name="hypervisor" unique="1" required="1">
<content type="string" />
<shortdesc lang="en">
Hypervisor hostname
</shortdesc>
<longdesc lang="en">
Host name to execute hypervisor. Root user shall be able to ssh to that node.
</longdesc>
</parameter>
<parameter name="run_dump" unique="0" required="0">
<content type="string" />
<shortdesc lang="en">
Run core dump
</shortdesc>
<longdesc lang="en">
If set plugin will call "virsh dump" before killing guest domain
</longdesc>
</parameter>
<parameter name="dump_dir" unique="1" required="0">
<content type="string" />
<shortdesc lang="en">
Run dump core with the specified directory 
When the "run_dump" parameter is set, this parameter is indispensable 
</shortdesc>
<longdesc lang="en">
This parameter can indicate the dump destination.
Should be set as a full path format, ex.) "/var/log/dump"
The above example would dump the core, like;
/var/log/dump/2009-0316-1403.37-GuestDomain.core
</longdesc>
</parameter>
</parameters>
SSHXML
    exit 0
    ;;
*)
    exit 1
    ;;
esac