[Zope] Re: zope unresponsive

Paul Williams Tue, 27 Feb 2007 06:33:52 -0800


Tres Seaver wrote:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Paul Williams wrote:
Ok, here is what we have. I did a netstat on both machines, client and server. The client sees an established connection and the server does not. In the server log there is a disconnect. As far as hardware between them, there is a switch (dell powerconnect 6024). Web Server Directors might get hold of it but there are no hops on traceroute. Traceroute only shows the client machine and the server machine.
So the client is just continuously polling the connection but getting nothing back.
That sounds like some weird kernel / networking problem to me:  I don't
see how Zope could be able to keep calling 'select' on a socket after
the other side has closed it.


We agree.  This is a strange situation that none of us have seen before.

However, we have until tomorrow to do something and replacing hardware is not feasible.


Is there any possibility that some kind of failover / IP takeover has
happened, such that the storage server now running is not the same host
/ instance as the one to which the clients originally connected?  Are
you using LVS + heartbeat, or some kind of hardware load balancer to
manage such redundancy?

We do have Web Services Directors that do load balancing, but in this particular case the storage server is not set up for load balancing. I am not aware of any features that make the zodb capable of clustering except for replication services offered through zope.

We are not sure whether the traffic is going to the Web Services Directors or not. Even if it is, there are thousands of settings and there is no one available who knows what to change.



The storage server is a simple nas server with a static ip address.

What we are thinking about doing is changing the code in zrpc/connection.py to close the connection in wait (line 638, zope version 2.9.5) if the wait time gets too large or the poll has happened too many times.
We are great at plone development, but have very little backend zope development experience. Would someone please advise me as to whether this is going to cause more problems?
According to the log message you posted earlier in the thread, your
appservers are spewing thousands of log messages from the connection's
'pending' method, although your deadlock debugger output shows the one
thread blocked on 'select' inside of the connection's 'wait' method.
There should be lots of log messages at TRACE level for the wait call,
including a doubling / backoff of the delay value from 1 ms to 1 sec.
Do you see those log messages, as well?

These messages are there. You can see the time doubling. This is where we were thinking of breaking the connection once it gets to a certain point and making zope reconnect.

This solves our hung connection problem, we think. However, I am hoping someone can let me know if I am breaking something else by doing this.
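
To make the idea concrete, here is a rough, standalone sketch of the kind of change being discussed. It is not the code in zrpc/connection.py; the names wait_for_reply, WaitTimedOut and max_total_wait are made up for illustration. It polls a socket with select, doubles the delay from 1 ms up to a 1 sec cap as described above, and closes the socket once the accumulated wait passes a limit so that the caller can reconnect.

```python
import select
import socket


class WaitTimedOut(Exception):
    """Raised when no reply arrives within the allowed total wait."""


def wait_for_reply(sock, max_total_wait=30.0, initial_delay=0.001, max_delay=1.0):
    """Poll sock until it is readable, backing off between polls.

    Hypothetical helper for illustration only, not part of ZEO.
    Returns the number of polls it took for the socket to become readable.
    Closes the socket and raises WaitTimedOut if the summed delays
    reach max_total_wait first.
    """
    delay = initial_delay
    waited = 0.0
    polls = 0
    while waited < max_total_wait:
        # select returns as soon as data is readable, or after `delay` seconds.
        readable, _, _ = select.select([sock], [], [], delay)
        polls += 1
        if readable:
            return polls
        waited += delay
        # Double the delay on each empty poll, capped at max_delay.
        delay = min(delay * 2, max_delay)
    # Give up on this connection; the caller decides whether to reconnect.
    sock.close()
    raise WaitTimedOut("no reply after %.3f seconds (%d polls)" % (waited, polls))
```

A caller would catch WaitTimedOut and open a fresh connection. Whether doing the equivalent inside the real 'wait' method is safe for pending calls and for the client cache is exactly the open question here.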



Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          [EMAIL PROTECTED]
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFF5Dvr+gerLs4ltQ4RAm/HAKCUN5WboOxVGeB11GhEfgYQ3wos3QCdH0TW
DbcpXiMPlcQYyx0gewPFMLI=
=9A/a
-----END PGP SIGNATURE-----

_______________________________________________
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **

(Related lists -http://mail.zope.org/mailman/listinfo/zope-announce

 http://mail.zope.org/mailman/listinfo/zope-dev )

