On Thu, Jul 02, 2015 at 03:10:49PM +0200, 'Helga Velroyen' via ganeti-devel
wrote:
By default, the RAPI daemon binds to 0.0.0.0 when it is
started, which means it serves on every IP address the
machine is configured for. This works well with the watcher,
which always polls the RAPI daemon on 127.0.0.1 and restarts
it when it is not reachable.

If a user starts the RAPI daemon bound to a particular IP
other than 127.0.0.1 (using the -b option, set for example
in /etc/default/ganeti), RAPI only serves on that IP and is
therefore not reachable on 127.0.0.1. Since the watcher only
polls 127.0.0.1, it inevitably fails to connect to the RAPI
daemon and restarts it every five minutes.

To solve this, this patch adds a --rapi-ip option to the
watcher. Whenever -b is set for the RAPI daemon, the watcher
needs to be given the same IP with --rapi-ip (for example by
editing /etc/cron.d/ganeti). This is not optimal for user
experience, as it is easy to forget one of the two places.
The alternative would be a Ganeti configuration parameter
that is fed to both the RAPI daemon and the watcher, but
that would be significantly more effort for this relatively
rarely used feature.

Signed-off-by: Helga Velroyen <[email protected]>
---
lib/watcher/__init__.py | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/lib/watcher/__init__.py b/lib/watcher/__init__.py
index ceb8f88..0d51e7d 100644
--- a/lib/watcher/__init__.py
+++ b/lib/watcher/__init__.py
@@ -477,6 +477,9 @@ def ParseOptions():
   parser.add_option("--no-wait-children", dest="wait_children",
                     action="store_false",
                     help="Don't wait for child processes")
+  parser.add_option("--rapi-ip", dest="rapi_ip",
+                    default=constants.IP4_ADDRESS_LOCALHOST,
+                    help="Use this IP to talk to RAPI.")
   # See optparse documentation for why default values are not set by options
   parser.set_defaults(wait_children=True)
   options, args = parser.parse_args()
@@ -704,13 +707,13 @@ def _GlobalWatcher(opts):
   # If RAPI isn't responding to queries, try one restart
   logging.debug("Attempting to talk to remote API on %s",
-                constants.IP4_ADDRESS_LOCALHOST)
-  if not IsRapiResponding(constants.IP4_ADDRESS_LOCALHOST):
+                opts.rapi_ip)
+  if not IsRapiResponding(opts.rapi_ip):
     logging.warning("Couldn't get answer from remote API, restaring daemon")
     utils.StopDaemon(constants.RAPI)
     utils.EnsureDaemon(constants.RAPI)
     logging.debug("Second attempt to talk to remote API")
-    if not IsRapiResponding(constants.IP4_ADDRESS_LOCALHOST):
+    if not IsRapiResponding(opts.rapi_ip):
       logging.fatal("RAPI is not responding")
   logging.debug("Successfully talked to remote API")
--
2.4.3.573.g4eafbef
LGTM
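
For reference, here is a minimal standalone sketch of the behaviour
the patch gives the watcher: an option that defaults to localhost,
and a poll / restart once / poll again sequence against the
configured IP. It is not the Ganeti code. The names parse_options
and check_rapi are hypothetical, as are the is_responding and restart
callables, which stand in for IsRapiResponding and the
StopDaemon/EnsureDaemon pair. The local IP4_ADDRESS_LOCALHOST
constant is assumed to mirror constants.IP4_ADDRESS_LOCALHOST.

```python
import logging
from optparse import OptionParser

# Assumed stand-in for constants.IP4_ADDRESS_LOCALHOST.
IP4_ADDRESS_LOCALHOST = "127.0.0.1"


def parse_options(argv):
    """Parse watcher-style options; --rapi-ip defaults to localhost."""
    parser = OptionParser()
    parser.add_option("--rapi-ip", dest="rapi_ip",
                      default=IP4_ADDRESS_LOCALHOST,
                      help="Use this IP to talk to RAPI.")
    options, _args = parser.parse_args(argv)
    return options


def check_rapi(rapi_ip, is_responding, restart):
    """Poll RAPI on rapi_ip; restart once and poll again if it is silent.

    is_responding(ip) and restart() are hypothetical callables supplied
    by the caller. Returns True if RAPI answered on the first or the
    second attempt, False otherwise.
    """
    logging.debug("Attempting to talk to remote API on %s", rapi_ip)
    if is_responding(rapi_ip):
        return True
    logging.warning("Couldn't get answer from remote API, restarting daemon")
    restart()
    logging.debug("Second attempt to talk to remote API")
    if is_responding(rapi_ip):
        return True
    logging.error("RAPI is not responding")
    return False
```

With a daemon that only answers on, say, 192.0.2.10 (a documentation
address used purely as an example), polling 127.0.0.1 triggers a
restart on every run, while polling the bound address does not. That
is the mismatch --rapi-ip is meant to let the administrator remove.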