Re: [Zope] Re: Zope 2.8.4 strange behavior
[Dieter Maurer]
> ...
> I think, Tim wanted to implement such a keep alive mechanism
> inside "ClientStorage" (to reliably detect disconnects) but
> in ZODB 3.4 it seems not yet available.

Right on all counts: I would like to add that, because it's currently possible for ZEO to run "forever" without noticing a connection is dead (when the OS/whatever doesn't inform it of socket death); this is especially damning for clients that normally don't try to commit changes, as they can continue serving stale cached content indefinitely.

And it's not in ZODB 3.4. It's not in ZODBs 3.5 or 3.6 either -- haven't had time to work on it.

___
Zope maillist - Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )
Re: [Zope] Re: Zope 2.8.4 strange behavior
On Wed, Nov 30, 2005 at 08:40:34PM +0100, Dieter Maurer wrote:
> Florent Guillaume wrote at 2005-11-30 01:51 +0100:
> > ... sending keepalive messages to ZEO ...
> >
> >Why not use the max-disconnect-poll option of the zeoclient section in
> >zope.conf ?
>
> Our solution is quite old. At that time, there was definitely no
> "max-disconnect-poll" yet.

Aside from that, it's not even mentioned anywhere but ZODB/component.xml so I had no idea it existed until now.

> In addition, "max-disconnect-poll" seems to target a completely different
> use case: to control the time between connection attempts.

I see. Thanks very much for your suggestion, Dieter - I'll look into that. It certainly sounds like we have the same symptom.

--
Paul Winkler
http://www.slinkp.com
Re: [Zope] Re: Zope 2.8.4 strange behavior
Florent Guillaume wrote at 2005-11-30 01:51 +0100:
> ... sending keepalive messages to ZEO ...
>
>Why not use the max-disconnect-poll option of the zeoclient section in
>zope.conf ?

Our solution is quite old. At that time, there was definitely no "max-disconnect-poll" yet.

In addition, "max-disconnect-poll" seems to target a completely different use case: to control the time between connection attempts.

In our case, a connection was successfully established. However, the firewall may cut it if it is inactive for too long a period -- in a way not noticed by the endpoints. Thus, we must prevent the connection from being idle too long -- e.g. with a keepalive mechanism.

I think Tim wanted to implement such a keep alive mechanism inside "ClientStorage" (to reliably detect disconnects) but in ZODB 3.4 it seems not yet available.

--
Dieter
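The idea described above (send some cheap traffic more often than the firewall's idle timeout) can be sketched generically as follows. This is only a minimal illustration in present-day Python, not Dieter's code and not a ZEO or ZODB API: `start_keepalive`, `ping` and `on_error` are hypothetical names, and `ping` stands for any zero-argument callable that makes a lightweight request over the connection.

```python
import threading


def start_keepalive(ping, interval_seconds, on_error=None):
    """Call ping() every interval_seconds on a background daemon thread.

    ping and on_error are hypothetical callables supplied by the caller;
    nothing here is part of ZEO or ZODB.
    Returns (thread, stop_event); set stop_event to end the loop.
    """
    stop_event = threading.Event()

    def loop():
        # Event.wait() returns False on timeout and True once stop is set,
        # so the loop sleeps for one interval between pings.
        while not stop_event.wait(interval_seconds):
            try:
                ping()
            except Exception as exc:
                # A failed ping suggests the link is dead; report it so the
                # caller can log the failure or trigger a reconnect.
                if on_error is not None:
                    on_error(exc)

    thread = threading.Thread(target=loop, name="keepalive", daemon=True)
    thread.start()
    return thread, stop_event
```

With a firewall that drops connections after 30 minutes of inactivity, an interval of `20 * 60` seconds would match the period mentioned elsewhere in this thread.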
[Zope] Re: Zope 2.8.4 strange behavior
Dieter Maurer wrote:
> Paul Winkler wrote at 2005-11-28 15:37 -0500:
> > > ... We had to implement a keep alive mechanism to prevent our
> > > firewall from behaving in this nasty way.
> >
> > OK. Can you give a high-level summary of what you did?
> > I thought of using heartbeat to detect loss of connection,
> > but I'm not sure what I could do on failure short of restarting Zope.
>
> We knew that our firewall shuts down connections with a timeout
> of 30 min. Thus, we send our ZEO a keep alive message
> every 20 min.
>
> The code roughly looks like this:
>
>     KeepPeriod = int(environ.get('ZEO_KEEP_ALIVE')) * 60
>     Storage = getConfiguration().dbtab.getDatabase('/')._storage
>
>     def keepAlive():
>         LOG("CustomZODB", INFO, "Keep alive thread started")
>         while 1:
>             sleep(KeepPeriod)
>             if Storage._ready.isSet():
>                 LOG("CustomZODB", INFO, "Sending keep alive message")
>                 Storage._load_lock.acquire()
>                 try:
>                     try:
>                         Storage._server.get_info()
>                         LOG("CustomZODB", INFO, "Sent keep alive message")
>                     except:
>                         LOG("CustomZODB", ERROR, " failed", error=exc_info())
>                 finally:
>                     Storage._load_lock.release()
>             else:
>                 LOG("CustomZODB", PROBLEM, "Connection is down")
>
>     start_new_thread(keepAlive, ())

Why not use the max-disconnect-poll option of the zeoclient section in zope.conf ?

Florent

--
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   [EMAIL PROTECTED]
Re: [Zope] Re: Zope 2.8.4 strange behavior
I have not seen any of the threading error exceptions--but then the patch catches them in the ZMySQLDA adaptor and punts...

On Sun, 27 Nov 2005, Chris McDonough wrote:

> Does this mean that you haven't seen the errors since installing
> Andy's patch? If not, I'd declare victory and forget about using the
> deadlock debugger (unless you want to do it for learning purposes only).
>
> On Nov 27, 2005, at 8:46 PM, Dennis Allison wrote:
>
> > Just went through that exercise with Andy and installed a patch to
> > MySQLDA that effectively ignores the 'release unlocked lock'
> > problem that has been plaguing us. I shoulda guessed that is the
> > first place to look.
> >
> > I'll get and install the DeadlockDebugger forthwith.
> >
> > Thanks.
Re: [Zope] Re: Zope 2.8.4 strange behavior
Does this mean that you haven't seen the errors since installing Andy's patch? If not, I'd declare victory and forget about using the deadlock debugger (unless you want to do it for learning purposes only).

On Nov 27, 2005, at 8:46 PM, Dennis Allison wrote:

> Just went through that exercise with Andy and installed a patch to
> MySQLDA that effectively ignores the 'release unlocked lock' problem
> that has been plaguing us. I shoulda guessed that is the first place
> to look.
>
> I'll get and install the DeadlockDebugger forthwith.
>
> Thanks.
Re: [Zope] Re: Zope 2.8.4 strange behavior
Just went through that exercise with Andy and installed a patch to MySQLDA that effectively ignores the 'release unlocked lock' problem that has been plaguing us. I shoulda guessed that is the first place to look.

I'll get and install the DeadlockDebugger forthwith.

Thanks.

On Mon, 28 Nov 2005, Florent Guillaume wrote:

> Dennis Allison wrote:
> > We have two recent instances in our production sites where Zope suddenly
> > stops responding. It is not a new problem, but we've now been confronted
> > with two clean examples and nothing to blame them on. The problem appears
> > to be independent of load as both incidents were on lightly loaded
> > machines.
> >
> > A check of the logs (Linux and Zope) shows nothing obviously amiss except
> > that the trace log (the old -M log) shows a sudden increase in active
> > requests from the typical 0 or 1 to 1300 or more. In this context an
> > "active request" is total number of requests pending at the end of this
> > request and is computed by post-processing. We front-end Zope with pound
> > and make heavy use of MySQL. Both show a plethora of incomplete
> > transactions.
> >
> > Examination of the raw trace log shows that Zope is continuing to accept
> > requests, but nothing is getting done. The raw log date-stamps four internal
> > states for each transaction. The states are Begin (B), Input (I),
> > action (A), and End (E). Inputs are gathered between B and I, output is
> > made between A and E. The raw log shows B and I transactions, but
> > apparently no processing is completing. I suspect that nothing is getting
> > scheduled.
> >
> > I am at a loss as to where to begin to track this one down. The failure
> > is spontaneous and apparently not triggered by any readily distinguishable
> > inputs or pattern of inputs. The behavior smells a bit of resource limits
> > or process synchronization problems, but there is no real evidence for
> > either being the root cause. I am not sure what monitoring I should be
> > doing to help locate the source of the problem.
> >
> > Has anyone seen a similar problem? Any advice as to how to proceed?
>
> Threads are hanging. You should install my DeadlockDebugger and track
> down where the hung threads are blocked at.
>
> From the description I'd wager that you'll find your threads stuck in a
> corner of the MySQL DA. In which case you'd have to find why it
> deadlocks and find a fix.
>
> Florent
[Zope] Re: Zope 2.8.4 strange behavior
Dennis Allison wrote:
> We have two recent instances in our production sites where Zope suddenly
> stops responding. It is not a new problem, but we've now been confronted
> with two clean examples and nothing to blame them on. The problem appears
> to be independent of load as both incidents were on lightly loaded
> machines.
>
> A check of the logs (Linux and Zope) shows nothing obviously amiss except
> that the trace log (the old -M log) shows a sudden increase in active
> requests from the typical 0 or 1 to 1300 or more. In this context an
> "active request" is total number of requests pending at the end of this
> request and is computed by post-processing. We front-end Zope with pound
> and make heavy use of MySQL. Both show a plethora of incomplete
> transactions.
>
> Examination of the raw trace log shows that Zope is continuing to accept
> requests, but nothing is getting done. The raw log date-stamps four internal
> states for each transaction. The states are Begin (B), Input (I),
> action (A), and End (E). Inputs are gathered between B and I, output is
> made between A and E. The raw log shows B and I transactions, but
> apparently no processing is completing. I suspect that nothing is getting
> scheduled.
>
> I am at a loss as to where to begin to track this one down. The failure
> is spontaneous and apparently not triggered by any readily distinguishable
> inputs or pattern of inputs. The behavior smells a bit of resource limits
> or process synchronization problems, but there is no real evidence for
> either being the root cause. I am not sure what monitoring I should be
> doing to help locate the source of the problem.
>
> Has anyone seen a similar problem? Any advice as to how to proceed?

Threads are hanging. You should install my DeadlockDebugger and track down where the hung threads are blocked at.

From the description I'd wager that you'll find your threads stuck in a corner of the MySQL DA. In which case you'd have to find why it deadlocks and find a fix.

Florent

--
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   [EMAIL PROTECTED]
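For readers who want to see what "track down where the hung threads are blocked" can look like in practice, here is a rough sketch using only the standard library of a present-day Python. It is not DeadlockDebugger's implementation (the thread does not describe how that product works), and `dump_thread_stacks` is a hypothetical helper name; it relies on `sys._current_frames()`, which may not be available on the Python versions used with Zope 2.8.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current call stack of every thread."""
    # Map thread identifiers to names so the report is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    report = []
    # sys._current_frames() maps each thread id to its topmost stack frame.
    for thread_id, frame in sys._current_frames().items():
        report.append("Thread %s (%s):" % (thread_id, names.get(thread_id, "unknown")))
        # format_stack() lists frames oldest first, so the last entry is
        # the call the thread is currently sitting in.
        report.extend(line.rstrip() for line in traceback.format_stack(frame))
        report.append("")
    return "\n".join(report)
```

If most worker threads show the same last frame, for example a lock acquisition inside a database adapter, that is a strong hint about where the hang is.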