Re: [Zope] Re: Zope 2.8.4 strange behavior

2005-11-30 Thread Tim Peters
[Dieter Maurer]
> ...
> I think, Tim wanted to implement such a keep alive mechanism
> inside "ClientStorage" (to reliably detect disconnects) but
> in ZODB 3.4 it seems not yet available.

Right on all counts:  I would like to add that, because it's currently
possible for ZEO to run "forever" without noticing a connection is
dead (when the OS/whatever doesn't inform it of socket death); this is
especially damning for clients that normally don't try to commit
changes, as they can continue serving stale cached content
indefinitely.  And it's not in ZODB 3.4.  It's not in ZODBs 3.5 or 3.6
either -- haven't had time to work on it.
___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )


Re: [Zope] Re: Zope 2.8.4 strange behavior

2005-11-30 Thread Paul Winkler
On Wed, Nov 30, 2005 at 08:40:34PM +0100, Dieter Maurer wrote:
> Florent Guillaume wrote at 2005-11-30 01:51 +0100:
> > ... sending keepalive messages to ZEO ...
> 
> >Why not use the max-disconnect-poll option of the zeoclient section in 
> >zope.conf ?
> 
> Our solution is quite old. At that time, there was definitely no
> "max-disconnect-poll" yet.

Aside from that, it's not even mentioned anywhere but ZODB/component.xml
so I had no idea it existed until now.

> In addition, "max-disconnect-poll" seems to target a completely different
> use case: to control the time between connection attempts.

I see. Thanks very much for your suggestion, Dieter - 
I'll look into that. It certainly sounds like we have the same symptom.

-- 

Paul Winkler
http://www.slinkp.com


Re: [Zope] Re: Zope 2.8.4 strange behavior

2005-11-30 Thread Dieter Maurer
Florent Guillaume wrote at 2005-11-30 01:51 +0100:
> ... sending keepalive messages to ZEO ...

>Why not use the max-disconnect-poll option of the zeoclient section in 
>zope.conf ?

Our solution is quite old. At that time, there was definitely no
"max-disconnect-poll" yet.

In addition, "max-disconnect-poll" seems to target a completely different
use case: to control the time between connection attempts.

In our case, a connection was successfully established. However,
the firewall may cut it if it stays inactive for too long --
in a way not noticed by the endpoints.

Thus, we must prevent the connection from being idle too long --
e.g. with a keepalive mechanism.
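
As an illustration only, the idea above (touch the connection more often
than the firewall's idle timeout) might be sketched in present-day Python
as below. The names start_keepalive, ping and period_seconds are made up
for this sketch; ping stands in for whatever cheap call the real connection
offers, and nothing here is ZEO API.

```python
import threading


def start_keepalive(ping, period_seconds, stop_event):
    """Call ping() every period_seconds until stop_event is set.

    ping is a hypothetical zero-argument callable that sends some
    cheap request over the connection that must not go idle.
    """
    def loop():
        # Event.wait() returns False on timeout and True once the
        # event has been set, so the loop ends when stop_event is set.
        while not stop_event.wait(period_seconds):
            try:
                ping()
            except Exception as exc:
                # A failed ping is reported but does not stop the loop.
                print("keep-alive failed: %r" % (exc,))

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

The period has to be shorter than the idle timeout of the device in
between, otherwise the connection can still be cut between two pings.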


I think, Tim wanted to implement such a keep alive mechanism
inside "ClientStorage" (to reliably detect disconnects) but
in ZODB 3.4 it seems not yet available.

-- 
Dieter


[Zope] Re: Zope 2.8.4 strange behavior

2005-11-29 Thread Florent Guillaume

Dieter Maurer wrote:
> Paul Winkler wrote at 2005-11-28 15:37 -0500:
> > ...
> > > We had to implement a keep alive mechanism to prevent our firewall
> > > from behaving in this nasty way.
> >
> > OK. Can you give a high-level summary of what you did?  I thought of
> > using heartbeat to detect loss of connection, but I'm not sure what I
> > could do on failure short of restarting Zope.
>
> We knew that our firewall shuts down connections with a timeout
> of 30 min. Thus, we send our ZEO a keep alive message
> every 20 min. The code roughly looks like this:
>
>     # Imports assumed for a Zope 2.8 / Python 2 setup (not in the original post)
>     from os import environ
>     from sys import exc_info
>     from thread import start_new_thread
>     from time import sleep
>     from App.config import getConfiguration
>     from zLOG import LOG, INFO, ERROR, PROBLEM
>
>     # Keep alive period in seconds; ZEO_KEEP_ALIVE is given in minutes
>     KeepPeriod = int(environ.get('ZEO_KEEP_ALIVE')) * 60
>
>     # The storage behind the main database
>     Storage = getConfiguration().dbtab.getDatabase('/')._storage
>
>     def keepAlive():
>         LOG("CustomZODB", INFO, "Keep alive thread started")
>         while 1:
>             sleep(KeepPeriod)
>             if Storage._ready.isSet():
>                 LOG("CustomZODB", INFO, "Sending keep alive message")
>                 Storage._load_lock.acquire()
>                 try:
>                     try:
>                         # A cheap server call that puts traffic on the connection
>                         Storage._server.get_info()
>                         LOG("CustomZODB", INFO, "Sent keep alive message")
>                     except:
>                         LOG("CustomZODB", ERROR, " failed", error=exc_info())
>                 finally:
>                     Storage._load_lock.release()
>             else:
>                 LOG("CustomZODB", PROBLEM, "Connection is down")
>
>     start_new_thread(keepAlive, ())


Why not use the max-disconnect-poll option of the zeoclient section in 
zope.conf ?


Florent

--
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   [EMAIL PROTECTED]


Re: [Zope] Re: Zope 2.8.4 strange behavior

2005-11-27 Thread Dennis Allison

I have not seen any of the threading error exceptions--but then the patch 
catches them in the ZMySQLDA adaptor and punts...
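
For readers unfamiliar with the error: releasing a lock that is not held
raises an exception, and catching it is one way to punt. Below is a
minimal sketch in present-day Python, where that exception is RuntimeError.
It is not the actual ZMySQLDA patch, whose contents are not shown in this
thread, and release_quietly is a hypothetical helper name.

```python
import threading


def release_quietly(lock):
    """Release lock, ignoring the error raised when it is not held.

    Returns True if the lock was released, False if it was not held.
    """
    try:
        lock.release()
        return True
    except RuntimeError:
        # threading.Lock raises RuntimeError when released while unlocked.
        return False


lock = threading.Lock()
released_unheld = release_quietly(lock)   # lock was never acquired
lock.acquire()
released_held = release_quietly(lock)     # lock was held
```

Swallowing the error this way hides whatever logic released the lock
twice, so it treats the symptom rather than the cause.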

On Sun, 27 Nov 2005, Chris McDonough wrote:

> Does this mean that you haven't seen the errors since installing
> Andy's patch?  If not, I'd declare victory and forget about using the
> deadlock debugger (unless you want to do it for learning purposes only).
>
> On Nov 27, 2005, at 8:46 PM, Dennis Allison wrote:
>
> > Just went through that exercise with Andy and installed a patch to
> > MySQLDA that effectively ignores the 'release unlocked lock' problem that
> > has been plaguing us.   I shoulda guessed that is the first place to look.
> >
> > I'll get and install the DeadlockDebugger forthwith.
> >
> > Thanks.

-- 



Re: [Zope] Re: Zope 2.8.4 strange behavior

2005-11-27 Thread Chris McDonough
Does this mean that you haven't seen the errors since installing  
Andy's patch?  If not, I'd declare victory and forget about using the  
deadlock debugger (unless you want to do it for learning purposes only).


On Nov 27, 2005, at 8:46 PM, Dennis Allison wrote:

> Just went through that exercise with Andy and installed a patch to
> MySQLDA that effectively ignores the 'release unlocked lock' problem that
> has been plaguing us.   I shoulda guessed that is the first place to look.
>
> I'll get and install the DeadlockDebugger forthwith.
>
> Thanks.




Re: [Zope] Re: Zope 2.8.4 strange behavior

2005-11-27 Thread Dennis Allison

Just went through that exercise with Andy and installed a patch to
MySQLDA that effectively ignores the 'release unlocked lock' problem that
has been plaguing us.   I shoulda guessed that is the first place to look.

I'll get and install the DeadlockDebugger forthwith.

Thanks.


On Mon, 28 Nov 2005, Florent Guillaume wrote:

> Dennis Allison wrote:
> > We have two recent instances in our production sites where Zope suddenly
> > stops responding.  It is not a new problem, but we've now been confronted
> > with two clean examples and nothing to blame them on.  The problem appears
> > to be independent of load as both incidents were on lightly loaded
> > machines.
> >
> > A check of the logs (Linux and Zope) shows nothing obviously amiss except
> > that the trace log (the old -M log) shows a sudden increase in active
> > requests from the typical 0 or 1 to 1300 or more.  In this context an
> > "active request" is the total number of requests pending at the end of this
> > request and is computed by post-processing.  We front-end Zope with pound
> > and make heavy use of MySQL.  Both show a plethora of incomplete
> > transactions.
> >
> > Examination of the raw trace log shows that Zope is continuing to accept
> > requests, but nothing is getting done.  The raw log date-stamps four internal
> > states for each transaction.  The states are Begin (B), Input (I),
> > Action (A), and End (E).  Inputs are gathered between B and I; output is
> > made between A and E.  The raw log shows B and I transactions, but
> > apparently no processing is completing.  I suspect that nothing is getting
> > scheduled.
> >
> > I am at a loss as to where to begin to track this one down.  The failure
> > is spontaneous and apparently not triggered by any readily distinguishable
> > inputs or pattern of inputs.  The behavior smells a bit of resource limits
> > or process synchronization problems, but there is no real evidence for
> > either being the root cause.   I am not sure what monitoring I should be
> > doing to help locate the source of the problem.
> >
> > Has anyone seen a similar problem?  Any advice as to how to proceed?
> 
> Threads are hanging. You should install my DeadlockDebugger and track 
> down where the hung threads are blocked at.
> 
>  From the description I'd wager that you'll find your threads stuck in a 
> corner of the MySQL DA. In which case you'd have to find why it 
> deadlocks and find a fix.
> 
> Florent
> 
> 

-- 



[Zope] Re: Zope 2.8.4 strange behavior

2005-11-27 Thread Florent Guillaume

Dennis Allison wrote:

> We have two recent instances in our production sites where Zope suddenly
> stops responding.  It is not a new problem, but we've now been confronted
> with two clean examples and nothing to blame them on.  The problem appears
> to be independent of load as both incidents were on lightly loaded
> machines.
>
> A check of the logs (Linux and Zope) shows nothing obviously amiss except
> that the trace log (the old -M log) shows a sudden increase in active
> requests from the typical 0 or 1 to 1300 or more.  In this context an
> "active request" is the total number of requests pending at the end of this
> request and is computed by post-processing.  We front-end Zope with pound
> and make heavy use of MySQL.  Both show a plethora of incomplete
> transactions.
>
> Examination of the raw trace log shows that Zope is continuing to accept
> requests, but nothing is getting done.  The raw log date-stamps four internal
> states for each transaction.  The states are Begin (B), Input (I),
> Action (A), and End (E).  Inputs are gathered between B and I; output is
> made between A and E.  The raw log shows B and I transactions, but
> apparently no processing is completing.  I suspect that nothing is getting
> scheduled.
>
> I am at a loss as to where to begin to track this one down.  The failure
> is spontaneous and apparently not triggered by any readily distinguishable
> inputs or pattern of inputs.  The behavior smells a bit of resource limits
> or process synchronization problems, but there is no real evidence for
> either being the root cause.   I am not sure what monitoring I should be
> doing to help locate the source of the problem.
>
> Has anyone seen a similar problem?  Any advice as to how to proceed?
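
The post-processing described above, counting requests that have begun but
not ended, can be sketched roughly as follows. This is only an
illustration: it assumes each trace-log line starts with the state code
(B, I, A or E) followed by a request id, which is not spelled out in this
thread, and the sample lines are invented.

```python
def pending_requests(lines):
    """Return the ids of requests that have a B record but no E record."""
    open_ids = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        code, request_id = fields[0], fields[1]
        if code == "B":
            open_ids.add(request_id)
        elif code == "E":
            open_ids.discard(request_id)
        # I and A records do not change whether a request is pending.
    return open_ids


# Invented sample: request 101 completes, request 102 never reaches E.
sample = [
    "B 101 2005-11-27T20:46:00 GET /a",
    "I 101 2005-11-27T20:46:00 0",
    "B 102 2005-11-27T20:46:01 GET /b",
    "I 102 2005-11-27T20:46:01 0",
    "A 101 2005-11-27T20:46:02 200 512",
    "E 101 2005-11-27T20:46:02",
]
print(sorted(pending_requests(sample)))
```

A count that keeps growing while B and I records still arrive matches
the symptom reported here: requests are accepted but never finished.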


Threads are hanging. You should install my DeadlockDebugger and track 
down where the hung threads are blocked at.


From the description I'd wager that you'll find your threads stuck in a 
corner of the MySQL DA. In which case you'd have to find why it 
deadlocks and find a fix.
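
To give an idea of what tracking down hung threads involves, here is a
minimal sketch using only the present-day Python standard library. It is
not DeadlockDebugger's code, and dump_thread_stacks is a name invented for
this example; it relies on sys._current_frames(), which maps each thread
id to that thread's current stack frame.

```python
import sys
import threading
import traceback


def dump_thread_stacks():
    """Return a text report with the current stack of every thread."""
    # Map thread ids to names so the report is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    chunks = []
    for thread_id, frame in sys._current_frames().items():
        name = names.get(thread_id, "unknown")
        chunks.append("Thread %s (%s):\n" % (thread_id, name))
        # format_stack() lists the frames, ending at the current one.
        chunks.append("".join(traceback.format_stack(frame)))
    return "\n".join(chunks)
```

In a hung process, the last frames listed for each thread show where it
is blocked, for example inside a database adapter waiting on a lock.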


Florent

--
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   [EMAIL PROTECTED]