Hi Yasumasa-san,

On Wed, Dec 22, 2010 at 05:58:34PM +0900, Yasumasa OZAKI wrote:
> Hi Dejan,
>
> I'm sorry I'm late.

NP.

> I took your advice, and corrected a plug-in, so it's sent.
>
> By the way, Holger made a similar plug-in, but how should I do?

Yes, Holger's plugin looks good to me. Your plugin uses ssh
explicitely whereas the other one relies on whatever protocol is
setup for libvirt (which can also be ssh, of course). There's
also the dump function in your plugin. I think that it would be
best if you implement that parameter for the Holger's plugin.

Cheers,

Dejan

> Best regards,
> Yasumasa OZAKI
>
> (2010/09/30 19:38), Yasumasa OZAKI wrote:
>> Hi Dejan,
>>
>> Thank you for reply.
>>
>> I correct it based on your opinion because I think that your opinion is
>> correct. wait a little, please.
>>
>> - The stop of the domain is judged from the virsh returns, and
>> CheckIfDead function is removed.
>>
>> - RunCommand function does, and shortens the refactoring.
>>
>> - Other points are corrected based on your opinion.
>>
>> Thanks,
>> Yasumasa OZAKI
>>
>> On Thu, 23 Sep 2010 17:08:12 +0200, Dejan Muhamedagic wrote:
>>> Hi Yasumasa-san,
>>>
>>> On Fri, Sep 10, 2010 at 03:17:56PM +0900, Yasumasa OZAKI wrote:
>>>> Hi,
>>>>
>>>> I made the STONITH plug-in that can be used by both Xen and KVM.
>>>
>>> Thanks! Comments below.
>>>
>>>> I would like to hear any opinion.
>>>>
>>>> Best regards,
>>>> Yasumasa OZAKI
>>>>
>>>>
>>>>
>>>
>>>> #!/bin/sh
>>>> #
>>>> # External STONITH module for Xen/KVM hypervisor through ssh.
>>>> # Uses Xen/KVM hypervisor as a STONITH device to control guest.
>>>> #
>>>> # Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>>>> #
>>>>
>>>> STOP_COMMAND="virsh destroy"
>>>> START_COMMAND="virsh start"
>>>> DUMP_COMMAND="virsh dump"
>>>> SSH_COMMAND="/usr/bin/ssh -q -x -n"
>>>>
>>>> # Rewrite the hostlist to accept "," as a delimeter for hostnames too.
>>>> hostlist=`echo $hostlist | tr ',' ' '`
>>>>
>>>> CheckIfDead() {
>>>>      for j in 1 2 3 4 5
>>>>      do
>>>>          if ! ping -w1 -c1 "$1">/dev/null 2>&1
>>>>          then
>>>>              return 0
>>>>          fi
>>>>          sleep 1
>>>>      done
>>>>
>>>>      return 1
>>>> }
>>>
>>> I'd rather have this one removed. If virsh returns a meaningful
>>> exit code, then we can rely on that, right? Or use domstate?
>>>
>>>> CheckHostList() {
>>>>      if [ "x" = "x$hostlist" ]
>>>>      then
>>>>          ha_log.sh err "hostlist isn't set"
>>>>          exit 1
>>>>      fi
>>>> }
>>>>
>>>> CheckHypervisor() {
>>>>      if [ "x" = "x$hypervisor" ]
>>>>      then
>>>>          ha_log.sh err "hypervisor isn't set"
>>>>          exit 1
>>>>      fi
>>>> }
>>>>
>>>> RunCommand() {
>>>>      CheckHostList
>>>>      CheckHypervisor
>>>>
>>>>      for node in $hostlist
>>>>      do
>>>>          if [ "$node" != "$1" ]
>>>>          then
>>>>              continue
>>>>          fi
>>>
>>> You can abbreviate code like this into a more compact
>>>
>>>     [ "$node" != "$1" ]&&  continue
>>>
>>>>          case $2 in
>>>>              stop)
>>>>                  if [ "x$run_dump" != "x" ]
>>>>                  then
>>>>                      #Need to run core dump
>>>>                      if [ "x$dump_dir" != "x" ]
>>>>                      then
>>>>                          TIMESTAMP=`date +%Y-%m%d-%H%M.%S`
>>>>                          DOMAINNAME=`printf "%s" $node`
>>>>                          COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core
>>>>                          #Run core dump
>>>>                          command_result="$($SSH_COMMAND $hypervisor " \
>>>> pgrep -f ^virsh_$node>/dev/null 2>&1; \
>>>> if [ \$? = 0 ]; then echo RUNNING; exit; fi; \
>>>> mkdir -p $dump_dir; \
>>>> if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \
>>>> (exec -a virsh_$node $DUMP_COMMAND $node $COREFILE>/dev/null 2>&1); \
>>>> if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \
>>>> echo OK")"
>>>>                          ssh_result=$?
>>>>                          if [ $ssh_result = 0 ]
>>>>                          then
>>>>                              case "$command_result" in
>>>>                                  RUNNING)
>>>>                                      ha_log.sh info "Dump is already 
>>>> running"
>>>>                                      exit 0
>>>>                                      ;;
>>>>                                  SUSPEND_FAILED)
>>>>                                      ha_log.sh err "Failed to suspend 
>>>> domain $node"
>>>>                                      ;;
>>>>                                  MKDIR_FAILED)
>>>>                                      ha_log.sh err "Failed to create 
>>>> directory $dump_dir"
>>>>                                      ;;
>>>>                                  DUMP_FAILED)
>>>>                                      ha_log.sh err "Failed to core dump 
>>>> domain $node to $COREFILE"
>>>>                                      ;;
>>>>                                  OK)
>>>>                                      ha_log.sh notice "Domain $node dumped 
>>>> to $COREFILE"
>>>>                                      ;;
>>>>                              esac
>>>>                          else
>>>>                              ha_log.sh err "Couldn't connect to hypervisor 
>>>> $hypervisor"
>>>>                          fi
>>>>                      else
>>>>                          ha_log.sh err "dump_dir isn't set"
>>>>                      fi
>>>>                  fi
>>>>
>>>>                  command_result=$($SSH_COMMAND $hypervisor "((sleep 2; 
>>>> $STOP_COMMAND $node)>/dev/null 2>&1&); echo \$?")
>>>>                  ssh_result=$?
>>>>                  if [ $ssh_result = 0 ]
>>>>                  then
>>>>                      if [ $command_result = 0 ]
>>>>                      then
>>>>                          ha_log.sh notice "Domain $node is stoped"
>>>>                      else
>>>>                          ha_log.sh err "Failed to stop domain $node"
>>>>                      fi
>>>>                  else
>>>>                      ha_log.sh err "Couldn't connect to hypervisor 
>>>> $hypervisor"
>>>>                  fi
>>>>                  break;;
>>>>              start)
>>>>                  command_result=$($SSH_COMMAND $hypervisor "((sleep 2; 
>>>> $START_COMMAND $node)>/dev/null 2>&1&); echo \$?")
>>>>                  ssh_result=$?
>>>>                  if [ $ssh_result = 0 ]
>>>>                  then
>>>>                      if [ $command_result = 0 ]
>>>>                      then
>>>>                          ha_log.sh notice "Domain $node is started"
>>>>                      else
>>>>                          ha_log.sh err "Failed to start domain $node"
>>>>                      fi
>>>>                  else
>>>>                      ha_log.sh err "Couldn't connect to hypervisor 
>>>> $hypervisor"
>>>>                  fi
>>>>                  break;;
>>>>          esac
>>>>          exit 0
>>>>      done
>>>> }
>>>
>>> This function is too long. Can you try to refactor and split it
>>> into several. Also, the stop and start good candidates to be
>>> folded into one function.
>>>
>>>> # Main code
>>>>
>>>> case $1 in
>>>> gethosts)
>>>>      CheckHostList
>>>>
>>>>      for node in $hostlist ; do
>>>>          echo $node
>>>>      done
>>>>      exit 0
>>>>      ;;
>>>> on)
>>>>      RunCommand $2 start
>>>>      exit $?
>>>>      ;;
>>>> off)
>>>>      if RunCommand $2 stop
>>>>      then
>>>>          if CheckIfDead $2
>>>>          then
>>>>              exit 0
>>>>          fi
>>>>      fi
>>>>
>>>>      exit 1
>>>>      ;;
>>>> reset)
>>>>      RunCommand $2 stop
>>>>
>>>>      if CheckIfDead $2
>>>>      then
>>>>          RunCommand $2 start
>>>>          exit 0
>>>>      fi
>>>>
>>>>      exit 1
>>>>      ;;
>>>> status)
>>>>      exit 0
>>>>      ;;
>>>> getconfignames)
>>>>      echo "hostlist hypervisor"
>>>>      exit 0
>>>>      ;;
>>>> getinfo-devid)
>>>>      echo "virsh STONITH device"
>>>>      exit 0
>>>>      ;;
>>>> getinfo-devname)
>>>>      echo "virsh STONITH external device"
>>>>      exit 0
>>>>      ;;
>>>> getinfo-devdescr)
>>>>      echo "ssh-based Linux host reset for Xen/KVM guest domain trough 
>>>> hypervisor"
>>>>      echo "Fine for testing, but not really suitable for production!"
>>>
>>> Well, it could be used in production too, if you replace the
>>> ping part.
>>>
>>>>      exit 0
>>>>      ;;
>>>> getinfo-devurl)
>>>>      echo "http://openssh.org http://www.xensource.com/ 
>>>> http://linux-ha.org/wiki";
>>>
>>> I think that this should be reduced to just xensource.
>>>
>>>>      exit 0
>>>>      ;;
>>>> getinfo-xml)
>>>>      cat<<  SSHXML
>>>> <parameters>
>>>> <parameter name="hostlist" unique="1" required="1">
>>>> <content type="string" />
>>>> <shortdesc lang="en">
>>>> Hostlist
>>>> </shortdesc>
>>>> <longdesc lang="en">
>>>> The list of controlled nodes.
>>>> For example: "node1 node2"
>>>> </longdesc>
>>>> </parameter>
>>>> <parameter name="hypervisor" unique="1" required="1">
>>>> <content type="string" />
>>>> <shortdesc lang="en">
>>>> Hypervisor hostname
>>>> </shortdesc>
>>>> <longdesc lang="en">
>>>> Host name to execute hypervisor. Root user shall be able to ssh to that 
>>>> node.
>>>
>>> Using public key authentication. This may also be a security
>>> issue, but let's leave that to users to decide.
>>>
>>>> </longdesc>
>>>> </parameter>
>>>> <parameter name="run_dump" unique="0" required="0">
>>>> <content type="string" />
>>>> <shortdesc lang="en">
>>>> Run core dump
>>>> </shortdesc>
>>>> <longdesc lang="en">
>>>> If set plugin will call "virsh dump" before killing guest domain
>>>> </longdesc>
>>>> </parameter>
>>>> <parameter name="dump_dir" unique="1" required="0">
>>>> <content type="string" />
>>>> <shortdesc lang="en">
>>>> Run dump core with the specified directory
>>>> When the "run_dump" parameter is set, this parameter is indispensable
>>>> </shortdesc>
>>>> <longdesc lang="en">
>>>> This parameter can indicate the dump destination.
>>>> Should be set as a full path format, ex.) "/var/log/dump"
>>>> The above example would dump the core, like;
>>>> /var/log/dump/2009-0316-1403.37-GuestDomain.core
>>>> </longdesc>
>>>> </parameter>
>>>> </parameters>
>>>> SSHXML
>>>>      exit 0
>>>>      ;;
>>>> *)
>>>>      exit 1
>>>>      ;;
>>>> esac
>>>
>>> Cheers,
>>>
>>> Dejan
>>
>> _______________________________________________________
>> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>> Home Page: http://linux-ha.org/
>>
>

> #!/bin/sh
> #
> # External STONITH module for Xen/KVM hypervisor through ssh.
> # Uses Xen/KVM hypervisor as a STONITH device to control guest.
> #
> # Copyright (c) 2010 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
> #
> 
> STOP_COMMAND="virsh destroy"
> START_COMMAND="virsh start"
> DUMP_COMMAND="virsh dump"
> SSH_COMMAND="/usr/bin/ssh -q -x -n"
> 
> # Rewrite the hostlist to accept "," as a delimeter for hostnames too.
> hostlist=`echo $hostlist | tr ',' ' '`
> 
> CheckHostList() {
>     if [ "x" = "x$hostlist" ]
>     then
>         ha_log.sh err "hostlist isn't set"
>         exit 1
>     fi
> }
> 
> CheckHypervisor() {
>     if [ "x" = "x$hypervisor" ]
>     then
>         ha_log.sh err "hypervisor isn't set"
>         exit 1
>     fi
> }
> 
> DumpNode() {
>     if [ "x$dump_dir" = "x" ]
>     then
>         ha_log.sh err "dump_dir isn't set"
>         return 1
>     fi
> 
>     TIMESTAMP=`date +%Y-%m%d-%H%M.%S`
>     DOMAINNAME=`printf "%s" $1`
>     COREFILE=$dump_dir/$TIMESTAMP-$DOMAINNAME.core
> 
>     command_result="$($SSH_COMMAND $2 " \
> pgrep -f ^virsh_$1 >/dev/null 2>&1; \
> if [ \$? = 0 ]; then echo RUNNING; exit; fi; \
> mkdir -p $dump_dir >/dev/null 2>&1; \
> if [ \$? != 0 ]; then echo MKDIR_FAILED; exit; fi; \
> (exec -a virsh_$1 $DUMP_COMMAND $1 $COREFILE >/dev/null 2>&1); \
> if [ \$? != 0 ]; then echo DUMP_FAILED; exit; fi; \
> echo OK \
> ")"
>     ssh_result=$?
>     ha_log.sh debug "\$ssh_result=$ssh_result"
>     case $ssh_result in
>         0)
>             case $command_result in
>                 RUNNING)
>                     ha_log.sh info "Dump is already running"
>                     ;;
>                 MKDIR_FAILED)
>                     ha_log.sh err "Failed to create directory $dump_dir"
>                     return 1
>                     ;;
>                 DUMP_FAILED)
>                     ha_log.sh err "Failed to core dump domain $1 to $COREFILE"
>                     return 1
>                     ;;
>                 OK)
>                     ha_log.sh notice "Domain $1 dumped to $COREFILE"
>                     ;;
>             esac
>             ;;
>         255)
>             ha_log.sh err "Couldn't connect to hypervisor $2"
>             return 1
>             ;;
>         *)
>             ha_log.sh err "Failed to core dump domain $1 to $COREFILE"
>             return 1
>             ;;
>     esac
> }
> 
> StopNode() {
>     $SSH_COMMAND $2 "$STOP_COMMAND $1 >/dev/null 2>&1"
>     ssh_result=$?
>     ha_log.sh debug "\$ssh_result=$ssh_result"
>     case $ssh_result in
>         0)
>             ha_log.sh notice "Domain $1 is stoped"
>             ;;
>       255)
>             ha_log.sh err "Couldn't connect to hypervisor $2"
>             return 1
>             ;;
>         *)
>             ha_log.sh err "Failed to stop domain $1"
>             return 1
>             ;;
>     esac
> }
> 
> StartNode() {
>     $SSH_COMMAND $2 "$START_COMMAND $1 >/dev/null 2>&1 &"
>     ssh_result=$?
>     ha_log.sh debug "\$ssh_result=$ssh_result"
>     case $ssh_result in
>         0)
>             ha_log.sh notice "Domain $1 is started"
>             ;;
>       255)
>             ha_log.sh err "Couldn't connect to hypervisor $2"
>             return 1
>             ;;
>         *)
>             ha_log.sh err "Failed to start domain $1"
>             return 1
>             ;;
>     esac
> }
> 
> RunCommand() {
>     CheckHostList
>     CheckHypervisor
>     
>     for node in $hostlist
>     do
>         [ "$node" != "$1" ] && continue
>         ha_log.sh debug "Target domain is $node"
>           
>         case $2 in
>             stop)
>                 [ "x$run_dump" != "x" ] && DumpNode $node $hypervisor
>                 StopNode $node $hypervisor || return 1
>                 break
>                 ;;
>             start)
>                 StartNode $node $hypervisor || return 1
>                 break
>                 ;;
>         esac
>     done
> }
> 
> 
> # Main code
> 
> case $1 in
> gethosts)
>     CheckHostList
>     
>     for node in $hostlist ; do
>         echo $node
>     done
>     exit 0
>     ;;
> on)
>     RunCommand $2 start
>     exit $?
>     ;;
> off)
>     RunCommand $2 stop
>     exit $?
>     ;;
> reset)
>     RunCommand $2 stop && RunCommand $2 start
>     exit $?
>     ;;
> status)
>     exit 0
>     ;;
> getconfignames)
>     echo "hostlist hypervisor"
>     exit 0
>     ;;
> getinfo-devid)
>     echo "virsh STONITH device"
>     exit 0
>     ;;
> getinfo-devname)
>     echo "virsh STONITH external device"
>     exit 0
>     ;;
> getinfo-devdescr)
>     echo "ssh-based Linux host reset for Xen/KVM guest domain trough 
> hypervisor"
>     echo "Fine for testing, but not really suitable for production!"
>     exit 0
>     ;;
> getinfo-devurl)
>     echo "http://openssh.org http://www.xensource.com/ 
> http://linux-ha.org/wiki";
>     exit 0
>     ;;
> getinfo-xml)
>     cat << SSHXML
> <parameters>
> <parameter name="hostlist" unique="1" required="1">
> <content type="string" />
> <shortdesc lang="en">
> Hostlist
> </shortdesc>
> <longdesc lang="en">
> The list of controlled nodes.
> For example: "node1 node2"
> </longdesc>
> </parameter>
> <parameter name="hypervisor" unique="1" required="1">
> <content type="string" />
> <shortdesc lang="en">
> Hypervisor hostname
> </shortdesc>
> <longdesc lang="en">
> Host name to execute hypervisor. Root user shall be able to ssh to that node.
> </longdesc>
> </parameter>
> <parameter name="run_dump" unique="0" required="0">
> <content type="string" />
> <shortdesc lang="en">
> Run core dump
> </shortdesc>
> <longdesc lang="en">
> If set plugin will call "virsh dump" before killing guest domain
> </longdesc>
> </parameter>
> <parameter name="dump_dir" unique="1" required="0">
> <content type="string" />
> <shortdesc lang="en">
> Run dump core with the specified directory 
> When the "run_dump" parameter is set, this parameter is indispensable 
> </shortdesc>
> <longdesc lang="en">
> This parameter can indicate the dump destination.
> Should be set as a full path format, ex.) "/var/log/dump"
> The above example would dump the core, like;
> /var/log/dump/2009-0316-1403.37-GuestDomain.core
> </longdesc>
> </parameter>
> </parameters>
> SSHXML
>     exit 0
>     ;;
> *)
>     exit 1
>     ;;
> esac

> _______________________________________________________
> Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/

Reply via email to