** Changed in: nova
       Status: Fix Committed => Fix Released

** Changed in: nova
    Milestone: None => juno-rc1

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1239864

Title:
  nova-api fails to query ServiceGroup status from Zookeeper

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  I am running with the ZooKeeper servicegroup driver on CentOS 6.4
  (Python 2.6) with the RDO distro of Grizzly.

  All nova services are successfully connecting to ZooKeeper, which I've
  verified using zkCli.

  However, when I run `nova service-list` I get an HTTP 500 error from
  nova-api. The nova-api log (/var/log/nova/api.log) shows:

  2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack   File "/usr/lib/python2.6/site-packages/nova/servicegroup/api.py", line 93, in service_is_up
  2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack     return self._driver.is_up(member)
  2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack   File "/usr/lib/python2.6/site-packages/nova/servicegroup/drivers/zk.py", line 116, in is_up
  2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack     all_members = self.get_all(group_id)
  2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack   File "/usr/lib/python2.6/site-packages/nova/servicegroup/drivers/zk.py", line 141, in get_all
  2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack     raise exception.ServiceGroupUnavailable(driver="ZooKeeperDriver")
  2013-10-14 16:33:15.110 6748 TRACE nova.api.openstack ServiceGroupUnavailable: The service from servicegroup driver ZooKeeperDriver is temporarily unavailable.

  The problem seems to be around evzookeeper (using version 0.4.0).

  To isolate the problem, I added some evzookeeper.ZKSession synchronous
  get() calls to test the roundtrip communication to ZooKeeper. When I
  do a `self._session.get(CONF.zookeeper.sg_prefix)` in the zk.py
  ZooKeeperDriver __init__() method it works fine. The logs show that
  this is immediately before the wsgi server starts up.

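The observation above (a session created in __init__(), before the wsgi server starts, behaves differently from one used at request time) points toward creating the session on first use instead. Below is a minimal sketch of that pattern. `LazySessionHolder` and its `factory` argument are hypothetical names introduced for illustration; they are not part of nova or evzookeeper, and the report itself does not establish the root cause.

```python
class LazySessionHolder(object):
    """Hypothetical helper: defer session creation until first use."""

    def __init__(self, factory):
        # factory is any zero-argument callable that builds a session.
        # Nothing is connected here, so no session exists at startup.
        self._factory = factory
        self._session = None

    def get(self):
        # Build the session on the first call, then reuse the same object.
        if self._session is None:
            self._session = self._factory()
        return self._session


if __name__ == "__main__":
    created = []

    def fake_factory():
        # Stand-in for real session construction (an assumption for the demo).
        created.append(1)
        return object()

    holder = LazySessionHolder(fake_factory)
    first = holder.get()
    second = holder.get()
    print(first is second, len(created))
```

In a driver, the factory would wrap whatever call constructs the real session, so the first request handler that needs it triggers the connection.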
  When I do the get() operation from within the ZooKeeperDriver
  get_all() method, the web request hangs indefinitely. However, if I
  recreate the evzookeeper.ZKSession within the get_all() method (after
  the wsgi server has started) the nova-api request is successful.

  diff --git a/nova/servicegroup/drivers/zk.py b/nova/servicegroup/drivers/zk.py
  index 2a3edae..7de2488 100644
  --- a/nova/servicegroup/drivers/zk.py
  +++ b/nova/servicegroup/drivers/zk.py
  @@ -122,7 +122,14 @@ class ZooKeeperDriver(api.ServiceGroupDriver):
           monitor = self._monitors.get(group_id, None)
           if monitor is None:
               path = "%s/%s" % (CONF.zookeeper.sg_prefix, group_id)
  -            monitor = membership.MembershipMonitor(self._session, path)
  +
  +            null = open(os.devnull, "w")
  +            local_session = evzookeeper.ZKSession(CONF.zookeeper.address,
  +                                                  recv_timeout=
  +                                                  CONF.zookeeper.recv_timeout,
  +                                                  zklog_fd=null)
  +
  +            monitor = membership.MembershipMonitor(local_session, path)
               self._monitors[group_id] = monitor
               # Note(maoy): When initialized for the first time, it takes a
               # while to retrieve all members from zookeeper. To prevent

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1239864/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp