Re: [Zope] Frequent ZOPE crashes

2009-11-29 Thread Andreas Krasa
On 25.11.09 17:37, Jaroslav Lukesh wrote:
 First, try to rule out errors outside of Zope itself. Try to install
 everything on a plain, brand-new (and reliable!) machine. Do not restore
 from any backups!

 - Original Message - From: Andreas Krasa
 andreas.kr...@wu-wien.ac.at

 A week ago we switched to a new layout (for corporate reasons) and now
 we're experiencing frequent crashes of the Zope servers. Fortunately

Hi Jaroslav,

we're right in the process of tracking down the error outside of ZOPE.

We have installed a completely new server from scratch with RHEL 5.4 and
have re-installed Python 2.4.6 and the latest versions of libxml2 and
libxslt there. We double-checked the LD config and made sure that the
correct shared objects get loaded (via lsof).
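
The lsof check described above can also be done from inside the running interpreter. The following is a minimal sketch, assuming a Linux system where /proc/<pid>/maps is readable; the helper name loaded_shared_objects is made up for this illustration, and the code is written for a current Python 3 rather than the Python 2.4 used in this thread.

```python
def loaded_shared_objects(pid="self"):
    """Return the sorted paths of shared objects mapped into a process (Linux only)."""
    paths = set()
    with open("/proc/%s/maps" % pid) as maps:
        for line in maps:
            # Columns: address, perms, offset, dev, inode, [pathname]
            fields = line.split(None, 5)
            if len(fields) == 6:
                path = fields[5].strip()
                if path.startswith("/") and ".so" in path:
                    paths.add(path)
    return sorted(paths)


if __name__ == "__main__":
    # Only libraries that are already loaded show up, so import the
    # extension modules of interest before calling the helper.
    for path in loaded_shared_objects():
        if "xml" in path or "xslt" in path:
            print(path)
```

Run inside the Zope process (or with the PID of a running instance), this lists which libxml2/libxslt files were actually mapped, which is the same information lsof reports.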

We also reinstalled a few other modules that contain C code (such as
python-ldap), which we need for authentication.

Unfortunately that didn't really help much. We still experience crashes.

Are there any known issues with Zope 2.11.2, LibXML2 and/or LibXSLT that 
could cause these problems?

The only thing we re-used is the Data.fs, which we have to, because 
we're talking about a production system here.

Also note that we have used exactly the same setup for a long time now,
even on the same hardware, without any of these troubles. The problems
only started when we switched over to a new (and probably more
resource-intensive) layout.

We're unfortunately still not able to reproduce these crashes.

Kind regards,
Andreas
___
Zope maillist  -  Zope@zope.org
https://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 https://mail.zope.org/mailman/listinfo/zope-announce
 https://mail.zope.org/mailman/listinfo/zope-dev )


Re: [Zope] Frequent ZOPE crashes

2009-11-29 Thread Andreas Krasa
Hi Tres,

thank you very much for your reply!

On 29.11.09 21:57, Tres Seaver wrote:

 - Original Message - From: Andreas Krasa
 andreas.kr...@wu-wien.ac.at

 we're right in the process of tracking down the error outside of ZOPE.

 We have completely installed a new server from scratch with RHEL 5.4 and
 have re-installed python 2.4.6 and the latest versions of libxml2 and
 libxslt there. We double checked the LD config, and made sure that the
 correct shared objects get loaded (via lsof).

 We also reinstalled a few other modules that contain C-code (such as
 python-ldap) which we need for being able to do authentication.

 Unfortunately that didn't really help much. We still experience crashes.

 Are there any known issues with Zope 2.11.2, LibXML2 and/or LibXSLT that
 could cause these problems?

 The only thing we re-used is the Data.fs, which we have to, because
 we're talking about a production system here.

 Also note that we have used exactly the same setup for a long time now,
 even on the same hardware, without any of these troubles. The problems
 only started when we switched over to a new (and probably more
 resource-intensive) layout.

 We're unfortunately still not able to reproduce these crashes.

 Can you set 'ulimit -c' to get a core file, which might at least help
 point to the extension which is to blame (although it may just show the
 downstream victim of a heap munge).

 What versions of libxml2 / libxslt are you using?  How about lxml?

Yes, we did set the ulimit and were indeed able to produce a coredump
for each crash (each between 300 and 700 MB in size). We tried to debug
them using gdb, but unfortunately they only reveal two situations in
which the crashes occur:

1) During garbage collection where the gc tries to clean up damaged 
python objects
2) During some ceval process, also related to accessing damaged python 
objects

Unfortunately that doesn't reveal what exactly trashes the objects. To us
it seems that the damage is done some time before either of the two
code paths mentioned above accesses the objects and crashes ZOPE.
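
If the damage really is done well before the garbage collector touches the objects, one generic debugging approach is to make the collector run far more often, so that a crash lands closer in time to whatever corrupted the object. The following is only a sketch of that idea using the standard gc module; the function name collect_aggressively is made up for this illustration, nobody in this thread reported trying it, and it slows the process down considerably.

```python
import gc


def collect_aggressively(threshold0=1):
    """Lower the generation-0 threshold so cyclic GC runs almost constantly.

    Returns the previous thresholds so they can be restored later.
    """
    previous = gc.get_threshold()
    # Keep the thresholds of the older generations unchanged.
    gc.set_threshold(threshold0, previous[1], previous[2])
    return previous


if __name__ == "__main__":
    old = collect_aggressively()
    print("thresholds were %r, now %r" % (old, gc.get_threshold()))
    gc.set_threshold(*old)  # restore the original settings
```

This is meant for a test instance only; whether it narrows down the culprit depends on the corruption being visible to the collector at all.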

For now, we don't see a reproducible pattern, as it seems to be some
more complex user behavior that leads to this. We were able to extract a
few URLs from the coredumps, but accessing those directly does nothing.
Accessing the last request logged in the Z2.log before the coredump
directly triggers nothing either.

We're now running ZOPE-2.11.2 with an eggified version of ZODB3-3.8.4 plus
libxml2-2.7.6, libxslt-1.1.26 and lxml-2.2.4; the crashes still
happen. Previously we were running ZOPE-2.11.2 with libxml2-2.7.3,
libxslt-1.1.24 and lxml-2.1.5. That also crashed ZOPE occasionally.

This has only happened since we switched to a new layout (probably in
combination with a few minor Silva updates).

We had been using the same system software (RHEL5), hardware, Python
version and libxml2/libxslt/lxml versions with our old layout, where
everything worked fine for years.

I would be happy to paste any particular gdb outputs if that is of any 
help...?

Kind regards,
Andreas


[Zope] Frequent ZOPE crashes

2009-11-25 Thread Andreas Krasa
Hello Mailinglist,

we've been using ZOPE in combination with the Silva CMS for around four
years now to serve our University's homepage. And everything worked fine
so far.

A week ago we switched to a new layout (for corporate reasons) and now
we're experiencing frequent crashes of the Zope servers. Fortunately
enough they reconnect themselves to the ZODB, but since this is now
happening around every five minutes, I'm rather worried that this might
permanently damage the ZODB.

I have absolutely no idea how this can happen, as we're using the same
python, libxml2, libxslt and other module versions as with the old
homepage - in fact the new site even runs on the same hardware. We never
experienced any problems like these up until now.

As far as I understand, it takes some C module to actually make ZOPE
segfault?

Versions we're using:

Python 2.4.6
Zope 2.11.2
LibXML2 2.7.3
LibXSLT 1.1.24
Python-LDAP 2.3.6
Setuptools 0.6c9
and a Kerberos Module

plus the Silva CMS (2.1) on top.

We have four ZOPE servers, each running two ZEO processes and a separate
ZODB. The machines all run RedHat Enterprise Linux 5.4. In front of that
Apache, Squid and Pound take care of the caching.

We examined the coredump files with gdb, but unfortunately this didn't
prove very helpful, because things go wrong either during garbage
collection or in some ceval code. So basically something trashes
certain Python objects at some earlier point.
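
Core files like the ones examined above are only written if the core size limit of the process allows it. Besides the shell's ulimit, the limit can be raised from Python itself with the standard resource module. This is a minimal sketch; the function name enable_core_dumps is made up for this illustration, and the call has to happen in the process that later crashes (or in its parent), for example early in a start script.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit to the hard limit for this process.

    Returns the (soft, hard) limits in effect after the change.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    unlimited = resource.RLIM_INFINITY
    print("core limit: soft=%s hard=%s (unlimited=%s)" % (soft, hard, unlimited))
```

If the hard limit is 0, the soft limit stays at 0 as well and no core file will be produced; in that case the hard limit has to be raised outside the process.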

Do you have *any* hints on how to track down this problem? Or are there
any known problems with the versions above? The changelogs didn't reveal
any plausible cause for me...

Kind regards,
Andreas Krasa


[Zope] Zope.org = Zope.com?

2006-04-03 Thread Andreas Krasa

Hello everybody,

it seems that currently all HTTP requests to www.zope.org and
dev.zope.org are forwarded to www.zope.com.

I find it hard to believe that this is a desired behavior...

But if yes, where can Zope itself and Zope products be downloaded? The
Zope.com site contains tons of nice eye-candy and marketing-yadayada but
no resources for developers whatsoever.

Regards,
Andreas Krasa

:: Andreas Krasa
   WU  ZID  Information Center :: fon: +43/1/31336/6996
   Augasse 2-6  1090 Wien  .at :: fax:  +43/1/31336/789
 :: icq:  2059600

:: pgp key-id: 0xDA178BDC


Re: [Zope] ZEO troubles on RedHat EL4 Linux

2005-08-18 Thread Andreas Krasa // WUW
Dieter Maurer wrote:
 Andreas Krasa // WUW wrote at 2005-8-16 18:37 +0200:
 
...
======================================================================
ERROR: checkMultipleAddresses (ZEO.tests.testConnection.MappingStorageConnectionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes
 
 
 I have seen similar errors happening non deterministically
 in the presence of a SIGCHLD handler set to SIG_IGN.
 Such a handler causes the operating system to reap away
 so called zombie processes and if the zombie no longer exists,
 waitpid will fail.
 
 
 Some *nix variants automatically pass the SIG_IGN down to child processes.
 Our Debian and SuSE Linux versions do.
 I had to change Zope.Startup.run not to use SIG_IGN as
 SIGCHLD handler in order to avoid such problems.
 
 In case, you run your tests with zopectl test, you may
 see this problem...
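
The effect described in the quoted explanation can be reproduced in isolation. The following is a minimal sketch for a current Python 3 on Linux (the thread itself used Python 2.3); the function name waitpid_with_ignored_sigchld is made up for this illustration. With SIGCHLD set to SIG_IGN the kernel reaps the child itself, so the later os.waitpid() call finds nothing to wait for and fails with ECHILD, which is errno 10 on Linux, the same error as in the traceback.

```python
import errno
import os
import signal


def waitpid_with_ignored_sigchld():
    """Fork a child while SIGCHLD is ignored; return the errno waitpid() fails with."""
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(0)  # child: exit immediately, skipping cleanup handlers
        try:
            os.waitpid(pid, 0)
        except OSError as exc:
            return exc.errno
        return None  # waitpid() succeeded, the child was still waitable
    finally:
        # Restore the previous handler (None means it was not set from Python).
        if previous is not None:
            signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    result = waitpid_with_ignored_sigchld()
    print("waitpid failed with errno %r (ECHILD is %d)" % (result, errno.ECHILD))
```

Because an ignored SIGCHLD disposition is inherited across fork and exec, a test runner started from a parent that ignores SIGCHLD would see the same failure without setting the handler itself.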
 

Hi Dieter!

Thanks very much for your help! I will give this one a try!

Btw. since this also happens on 5 other machines - all natively
installed with RHEL4 - there actually might really be something wrong
within the OS.

Is that worth submitting a bug to RedHat? Or is it more like a
feature? ;)

Thanks again,
Andreas


Re: [Zope] ZEO troubles on RedHat EL4 Linux

2005-08-18 Thread Andreas Krasa // WUW
Jens Vagelpohl wrote:

 On 18 Aug 2005, at 07:50, Andreas Krasa // WUW wrote:
 
 Is that worth submitting a bug to RedHat? Or is it more like a
 feature? ;)
 
 
 Why would RedHat care? They will just throw it back at you and say 
 sorry, Zope is not one of our supported packages.
 
 By the way, I hope you are not running Zope on the system-installed 
 Python? If you do, then change your setups to build and install your 
 own Python just for Zope and test again.
 
 jens

Hi Jens,

no, we've rebuilt Python (2.3.5) from source and, as our main Zope
product Silva requires them, also libxml2 and libxslt (built against our
own Python, of course). This all resides in /usr/local. We've
compiled Zope pointing to /usr/local/bin/python23, so I guess that
RedHat's own Python RPM does not interfere with Zope, at least I hope so.

As I understood Dieter's mail, this strange behavior is caused by the
way RedHat Enterprise Linux 4 system libraries handle SIG_IGN/SIGCHLD.

If this problem were due to something Zope does improperly, most people
would be seeing this sort of problem, which is not the case. That makes
me believe that the failure of the ZEO tests is actually caused by some
uncommon or improper handling of those two signal settings - which, in my
opinion, makes it something RedHat should take a look at.

Anyway - how severe are those testing failures for actually USING a ZEO
client/server on that particular OS as a production system?

Cheers,
Andreas


[Zope] ZEO troubles on RedHat EL4 Linux

2005-08-16 Thread Andreas Krasa // ZID

Hello everybody!

We are encountering some really strange problems with Zope 2.7.7 on our
RedHat EL 4 Linux machines.

The Zope 2.7.7 compilation itself works; however, most of the time make
test returns a random number of errors (somewhere between 20 and 30),
ALL related to ZEO.

The funny thing is, we've managed to do a make test without any
failures - however after doing a make distclean and compiling
everything again make test produces the above mentioned errors (using
*exactly* the same source code!).

I have absolutely no idea how this can happen - ANY hints are
appreciated! Is this a known issue? What could it be related to?

Thanks a lot!

Regards,
Andreas Krasa



Re: [Zope] ZEO troubles on RedHat EL4 Linux

2005-08-16 Thread Andreas Krasa // WUW
Jens Vagelpohl wrote:
 The Zope 2.7.7 compilation itself works; however, most of the time make
 test returns a random number of errors (somewhere between 20 and 30),
 ALL related to ZEO.
 
 
 Maybe someone can help if you actually *tell us* what these errors  are.
 At least my own crystal ball is in the shop for repairs right  now... :)
 
 jens
 

Hi!

Oops, almost forgot about those - the errors are as follows. They are
always related to ZEO and an OSError "No child processes".

Thanks & best regards,
Andreas Krasa

---

======================================================================
ERROR: checkMultipleAddresses (ZEO.tests.testConnection.MappingStorageConnectionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkMultipleServers (ZEO.tests.testConnection.MappingStorageConnectionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkReadOnlyClient (ZEO.tests.testConnection.MappingStorageConnectionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkReadOnlyFallbackReadOnlyServer (ZEO.tests.testConnection.MappingStorageConnectionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkReadOnlyFallbackWritable (ZEO.tests.testConnection.MappingStorageConnectionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkReconnectWritable (ZEO.tests.testConnection.MappingStorageConnectionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkReconnection (ZEO.tests.testConnection.MappingStorageConnectionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkTimeout (ZEO.tests.testConnection.MappingStorageTimeoutTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkTimeoutAfterVote (ZEO.tests.testConnection.MappingStorageTimeoutTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR: checkTimeoutOnAbortNoLock (ZEO.tests.testConnection.MappingStorageTimeoutTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/src/__zope__/Zope-2.7.7-final/lib/python/ZEO/tests/ConnectionTests.py", line 121, in tearDown
    os.waitpid(pid, 0)
OSError: [Errno 10] No child processes

======================================================================
ERROR

Re: [Zope] ZEO troubles on RedHat EL4 Linux

2005-08-16 Thread Andreas Krasa // WUW
Tim Peters wrote:
 [Andreas Krasa]
 
We are encountering some really strange problems with Zope 2.7.7 on our
RedHat EL 4 Linux machines.

The Zope 2.7.7 compilation itself works; however, most of the time make
test returns a random number of errors (somewhere between 20 and 30),
ALL related to ZEO.

The funny thing is, we've managed to do a make test without any
failures - however after doing a make distclean and compiling
everything again make test produces the above mentioned errors (using
*exactly* the same source code!).

I have absolutely no idea how this can happen - ANY hints are
appreciated! Is this a known issue?
 
 
 No.  For example, it doesn't happen in the daily overnight testrunner reports.
 
 
What could it be related to?
 
 
 ZEO <wink>?  You'll have to give more info about which tests fail, and
 precisely how they fail.  Because many of the ZEO tests create
 multiple processes, and try to assign sockets so that these processes
 can communicate, they're vulnerable to vagaries of OS process
 scheduling and socket use by other apps.  For example, on a slow or
 overburdened (with other simultaneous work) machine, some ZEO tests
 can fail due to not getting enough cycles soon enough.  The worst
 tests of that sort wait as long as a minute now for another process to
 do something they're waiting for before failing, but not even
 waiting a minute can _guarantee_ success.
 
 Might be informative to run the tests on an otherwise-quiet machine.
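
The timing sensitivity described in the quoted reply comes from tests that poll for another process until a deadline passes. The following is only an illustrative sketch of that pattern, not the actual ZEO test code; the helper name wait_until_listening and its parameters are made up for this illustration, and it is written for a current Python 3.

```python
import socket
import time


def wait_until_listening(host, port, timeout=60.0, interval=0.1):
    """Poll until something accepts connections on (host, port) or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means the other process is up.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    # On a slow or overloaded machine this can happen even though the
    # other process would have come up a moment later.
    return False
```

However long the timeout is, a sufficiently starved machine can still exceed it, which is why such tests cannot guarantee success and why an otherwise-quiet machine gives more meaningful results.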

Thank you Tim for the feedback!

Our system is an Intel Xeon 3 GHz dual-CPU machine with 2.5 GB RAM running
RedHat Enterprise Linux 4 (SElinux disabled).

As this is a test-machine it doesn't run any CPU-consuming tasks I can
think of - the server load is usually somewhere between 0.00 and 0.10.

But I'll check that nevertheless!

Best regards
Andreas