Re: [ZODB-Dev] ZEO Client deadlocking in asyncore.poll - how to I debug

2008-04-07 Thread Alan Runyan
Check the ZEO server log files. A known problem is iptables, or some
other kind of packet filtering, sitting between the ZEO clients and the
ZEO server. That configuration took several hours off my life ;-(
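
A minimal sketch of a first check to run from each client machine: can a TCP connection to the ZEO port be opened at all? The function name and the host name used below are placeholders, not anything from ZEO. The ports 8200 and 8201 are the ones in the posted zeo.conf. A successful probe only rules out a plain block; it says nothing about a stateful filter that silently drops long-lived or idle connections, which still has to be looked for in the server log.

```python
import socket


def port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts (e.g. silently dropped
        # packets) and name-resolution failures.
        return False
    sock.close()
    return True
```

For example, `port_reachable("zeo.example.com", 8200)` probes the storage port and `port_reachable("zeo.example.com", 8201)` the monitor port, with `zeo.example.com` standing in for the real server name.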

On Mon, Apr 7, 2008 at 9:16 AM, Anton Stonor [EMAIL PROTECTED] wrote:
 We have a setup with a ZEO server and 4 ZEO clients.

 Over the last few weeks we have seen almost daily deadlocks in some of the
 ZEO clients. I've tried waiting for up to 30 minutes before restarting a
 client.

 I could use some advice on how to debug this.

 With DeadlockDebugger I see the same pattern each time:

 One thread is hanging:


   File "/usr/local/www/zope-2.9.6/lib/python/ZODB/Connection.py", line 732, in setstate
     self._setstate(obj)
   File "/usr/local/www/zope-2.9.6/lib/python/ZODB/Connection.py", line 768, in _setstate
     p, serial = self._storage.load(obj._p_oid, self._version)
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 746, in load
     return self.loadEx(oid, version)[:2]
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 769, in loadEx
     data, tid, ver = self._server.loadEx(oid, version)
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ServerStub.py", line 192, in loadEx
     return self.rpc.call("loadEx", oid, version)
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", line 531, in call
     r_flags, r_args = self.wait(msgid)
   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/zrpc/connection.py", line 638, in wait
     asyncore.poll(delay, self._singleton)
   File "/usr/local/lib/python2.4/asyncore.py", line 122, in poll
     r, w, e = select.select(r, w, e, timeout)


 The other threads of the ZEO client are waiting for the hanging thread to
 release the storage lock so that they can acquire it:

   File "/usr/local/www/zope-2.9.6/lib/python/ZEO/ClientStorage.py", line 760, in loadEx
     self._load_lock.acquire()


 When I connect to the ZEO server monitor I can see an increasing number of
 reads (probably from the other ZEO Clients).

 I've set transaction-timeout 15.

 How do I dig further to resolve this?

 zeo.conf partly below:

 --
 <zeo>
   address 8200
   read-only false
   invalidation-queue-size 100
   # pid-filename $INSTANCE/var/ZEO.pid
   monitor-address 8201
   transaction-timeout 15
 </zeo>

 <filestorage 1>
   path $INSTANCE/var/Data.fs
 </filestorage>

 %import tempstorage
 <temporarystorage temp>
   name temporary storage for sessioning
 </temporarystorage>
 --

 Anton



 ___
 For more information about ZODB, see the ZODB Wiki:
 http://www.zope.org/Wikis/ZODB/

 ZODB-Dev mailing list  -  ZODB-Dev@zope.org
 http://mail.zope.org/mailman/listinfo/zodb-dev




-- 
Alan Runyan
Enfold Systems, Inc.
http://www.enfoldsystems.com/
phone: +1.713.942.2377x111
fax: +1.832.201.8856


Re: [ZODB-Dev] ZEO Client deadlocking in asyncore.poll - how to I debug

2008-04-07 Thread Roché Compaan
Check that your ZEO client cache is big enough. If your code makes
queries that return more objects than the cache can hold, the client
ends up constantly loading objects from the storage server. If you
switch on debug logging on the ZEO server, you should see which objects
are being loaded.
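
To illustrate why an undersized cache hurts so much, here is a toy simulation. It assumes a plain LRU cache and a query that scans the same set of objects repeatedly; the real ZEO client cache has its own eviction scheme, so the numbers are only indicative. The function name and parameters are made up for this sketch.

```python
from collections import OrderedDict


def hit_rate(cache_size, working_set, passes=5):
    """Fraction of loads served from a toy LRU cache while scanning
    `working_set` distinct objects `passes` times in the same order."""
    cache = OrderedDict()
    hits = loads = 0
    for _ in range(passes):
        for oid in range(working_set):
            loads += 1
            if oid in cache:
                hits += 1
                cache.move_to_end(oid)  # mark as most recently used
            else:
                cache[oid] = True  # stands in for a round trip to the server
                if len(cache) > cache_size:
                    cache.popitem(last=False)  # evict least recently used
    return hits / loads
```

With a cache of 1000 objects, a working set of 800 is served from the cache on every pass after the first, while a working set of 1200 never gets a single hit: each object is evicted before the scan comes back to it, so every load goes to the server.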

-- 
Roché Compaan
Upfront Systems   http://www.upfrontsystems.co.za



[ZODB-Dev] Re: ZEO Client deadlocking in asyncore.poll - how to I debug

2008-04-07 Thread Anton Stonor

Thanks for your suggestions, Alan, Roché and Dieter,

I'll switch the ZEO server logging to debug level, even though the amount
of data is scary, and try to find a way to reduce the load on the ZEO
server (Roché).


I think you (Alan and Dieter) might be right that there could be a
network issue that gets triggered under high load. We don't have any
apparent packet filtering rules. Maybe a closer look with
tcpdump/wireshark could reveal something.


I'll keep you posted.

While we are working on getting to the root of this, isn't there a way
to set a timeout on the client side, so that it won't wait forever for a
response that is lost in the mail?
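
For reference, the general shape of such a client-side timeout is a poll loop with an overall deadline. The sketch below is generic Python on a plain socket, not ZEO code or a ZEO option: `ReplyTimeout` and `wait_for_reply` are hypothetical names, and applying the idea to the `wait` method in the traceback would mean patching zrpc, with all the care that requires around the load lock.

```python
import select
import socket
import time


class ReplyTimeout(Exception):
    """Raised when no reply arrives before the deadline (hypothetical name)."""


def wait_for_reply(sock, timeout, poll_interval=1.0):
    """Block until `sock` is readable, but give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise ReplyTimeout("no reply within %.1f s" % timeout)
        # Never sleep past the deadline, however long the poll interval is.
        readable, _, _ = select.select(
            [sock], [], [], min(poll_interval, remaining))
        if readable:
            return sock.recv(4096)
```

The caller then decides what a timeout means: drop the connection and reconnect, or fail the request. Either is better than holding a lock indefinitely while other threads queue up behind it.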


Thanks again,

Anton
