Check the pyOpenSSL version on every machine, and upgrade pyOpenSSL on all of them to pyOpenSSL-0.6-2.el5. Check the hosts as well: make sure all machines can resolve each other.
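For the name resolution check, something like the following could be run on each machine. It is only a rough sketch: the `unresolvable` helper is made up for illustration and is not part of func, and the host names in the usage comment are placeholders for your own master and minions.

```python
import socket


def unresolvable(hostnames, resolve=socket.gethostbyname):
    """Return the names from `hostnames` that fail to resolve.

    `resolve` defaults to socket.gethostbyname; it is a parameter only
    so the check can be exercised without real lookups.
    """
    failed = []
    for name in hostnames:
        try:
            resolve(name)
        except socket.error:
            # socket.gaierror and socket.herror are subclasses of socket.error
            failed.append(name)
    return failed


# Example usage (placeholder host names, replace with your own):
#   for name in unresolvable(["master.example.com", "minion1.example.com"]):
#       print("cannot resolve: %s" % name)
```

An empty result on every machine means each one can resolve all the names in the list.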
On Thu, Oct 20, 2011 at 12:07 AM, Alison Young <[email protected]> wrote:

> Hello,
>
> We are seeing an occasional problem where restarts of funcd on the minions
> are not successful and the func daemon is stopped but not able to start
> again.
>
> Checking func.log gives:
>
> 2011-10-02 04:02:04,321 - INFO - Exception occured: socket.error
> 2011-10-02 04:02:04,321 - INFO - Exception value: (98, 'Address already in use')
> 2011-10-02 04:02:04,322 - INFO - Exception Info:
>   File "/usr/bin/funcd", line 23, in ?
>     server.main(sys.argv)
>   File "/usr/lib/python2.4/site-packages/func/minion/server.py", line 413, in main
>     serve()
>   File "/usr/lib/python2.4/site-packages/func/minion/server.py", line 225, in serve
>     server = setup_server()
>   File "/usr/lib/python2.4/site-packages/func/minion/server.py", line 220, in setup_server
>     server = FuncSSLXMLRPCServer((listen_addr, listen_port), config.module_list)
>   File "/usr/lib/python2.4/site-packages/func/minion/server.py", line 279, in __init__
>     self.ca)
>   File "/usr/lib/python2.4/site-packages/func/minion/AuthedXMLRPCServer.py", line 74, in __init__
>     SimpleXMLRPCServer.SimpleXMLRPCServer.__init__(self, address, AuthedSimpleXMLRPCRequestHandler)
>   File "/usr/lib64/python2.4/SimpleXMLRPCServer.py", line 473, in __init__
>     SocketServer.TCPServer.__init__(self, addr, requestHandler)
>   File "/usr/lib64/python2.4/SocketServer.py", line 330, in __init__
>     self.server_bind()
>   File "/usr/lib64/python2.4/SocketServer.py", line 341, in server_bind
>     self.socket.bind(self.server_address)
>   File "<string>", line 1, in bind
>
> As you may guess from the timestamp we are seeing this problem most often
> at 4:02am on Sundays, i.e. as part of the logrotate of func logs. Logging in
> to the server and starting the func service once we spot it is stopped has
> always worked so far without needing manual removal of any pid or lock file.
>
> One theory is that this problem occurred when the func minion was
> processing a command and told to restart part way through. From watching
> netstat, it looks like the func daemon stops listening on the minion port to
> allow the spawned process to communicate with the master. If the daemon
> stops, the spawned process blocks a new daemon from starting ('Address
> already in use') but that spawned process then exits and we're left with no
> daemons.
>
> Does this ring any bells with anyone? Is this a known bug?
>
> We've already added monit to mop up after this, but it'd be much preferable
> to find a proper fix.
>
> Alison
>
> _______________________________________________
> Func-list mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/func-list
>

--
--------------------------
马新成 | Jackie Ma
MSN: [email protected]
QQ: 2252339967
Twitter: @JackieMa2
G+: Jackie Ma
My_web: http://jackiema.blog.chinaunix.net
http://cn.linkedin.com/in/jacknet
Making IT operations simple, convenient, and intelligent; improving operational efficiency and saving manpower
