On Sat, May 29, 2010 at 9:27 PM, Tom Limoncelli <[email protected]> wrote:
>
> Signed-off-by: Tom Limoncelli <[email protected]>
> ---
>  daemons/ganeti-watcher |   41 +++++++++++++++++++++++++++++++++++++++++
>  lib/utils.py           |   13 +++++++++++++
>  2 files changed, 54 insertions(+), 0 deletions(-)
>
> diff --git a/daemons/ganeti-watcher b/daemons/ganeti-watcher
> index 1f82db8..3665f03 100755
> --- a/daemons/ganeti-watcher
> +++ b/daemons/ganeti-watcher
> @@ -36,6 +36,7 @@ import sys
>  import time
>  import logging
>  from optparse import OptionParser
> +import urllib2

I don't see anything in this patch using this import.
Is it needed? Does the commit pass "make lint"?

>
>  from ganeti import utils
>  from ganeti import constants
> @@ -48,6 +49,7 @@ from ganeti import ssconf
>  from ganeti import bdev
>  from ganeti import hypervisor
>  from ganeti.confd import client as confd_client
> +from ganeti.rapi import client as rapi_client
>
>
>  MAXTRIES = 5
> @@ -595,6 +597,30 @@ def OpenStateFile(path):
>   return os.fdopen(statefile_fd, "w+")
>
>
> +def IsRAPIResponding(cluster):
> +  """Connects to RAPI port and does a simple test.
> +
> + �...@type cluster: string
> + �...@param cluster: hostname of the cluster to connect to.
> +

Should we call it "hostname" or "host" rather than cluster?
In the end we're not passing a "cluster" at all. :)

> +  Returns:
> +    True: basic tests passed.
> +    False: basic tests failed.

s/tests/test/ (you're doing only one)

> +  """
> +  ssl_config = rapi_client.CertAuthorityVerify(constants.RAPI_CERT_FILE)
> +  try:
> +    master_version = rapi_client.GanetiRapiClient(
> +        cluster,
> +        port=constants.DEFAULT_RAPI_PORT,
> +        config_ssl_verification=ssl_config,
> +        username="", password="").GetVersion()
> +  except:

Michael suggested to catch more specific exceptions.
I did as well. :)

> +    logging.warning("Could not open connection to RAPI")
> +    return False
> +  logging.debug("RAPI master_version is %s", master_version)
> +  return master_version == constants.RAPI_VERSION
> +

Fair enough, but should we really check? Do we care?
It's not like it can answer any other version. And in the future, if
it can, perhaps this code can work anyway.

> +
>  def ParseOptions():
>   """Parse the command line options.
>
> @@ -668,6 +694,21 @@ def main():
>       # we are on master now
>       utils.EnsureDaemon(constants.RAPI)
>
> +      # If RAPI isn't responding to queries, try one restart.
> +      logging.debug("Attempting to talk with RAPI")
> +      rapi_test_result = IsRAPIResponding("localhost")
> +      logging.debug("RAPI Result = %s", rapi_test_result)
> +      if rapi_test_result == None:

>>> False == None
False
 You want if not rapi_test_result:
Please stop rapi and test your code on both paths :)

> +        logging.warning("RAPI is running but did not speak.  Killing RAPI")

"Couldn't get an answer from the Ganeti RAPI daemon" perhaps?
(we're not *sure* it's running. we hope so, might have crashed?)

> +        utils.StopDaemon(constants.RAPI)
> +        utils.EnsureDaemon(constants.RAPI)
> +        logging.debug("Attempting to talk with RAPI")
> +        rapi_test_result = IsRAPIResponding("localhost")
> +        logging.debug("RAPI Result = %s", rapi_test_result)
> +        if rapi_test_result is None:

if not rapi_...:

Also, these three lines are the same, above.
Should we put the debugging line in IsRAPIResponding and just do

if IsRAPIResponding(...):

> +          logging.fatal("RAPI is not responding. Please investigate.")
> +      logging.debug("RAPI works. Result = %s", rapi_test_result)
> +
>       try:
>         watcher = Watcher(options, notepad)
>       except errors.ConfigurationError:
> diff --git a/lib/utils.py b/lib/utils.py
> index 7b93870..9b327d3 100644
> --- a/lib/utils.py
> +++ b/lib/utils.py
> @@ -2166,6 +2166,19 @@ def EnsureDaemon(name):
>   return True
>
>
> +def StopDaemon(name):
> +  """Stop a daemon.
> +
> +  """
> +  result = RunCmd([constants.DAEMON_UTIL, "stop", name])
> +  if result.failed:
> +    logging.error("Can't stop daemon '%s', failure %s, output: %s",
> +                  name, result.fail_reason, result.output)
> +    return False
> +
> +  return True
> +
> +
>  def WritePidFile(name):
>   """Write the current process pidfile.
>
> --
> 1.7.0.1
>
>

Thanks,

Guido

Reply via email to