RE: [Zope-dev] Segfault and Deadlock
For another way round this issue of segfaults and deadlock when using Python 2.2, has anyone tried running Zope with a Python built to use the GNU Pth library instead of the system's pthread library? GNU Pth is an entirely user-space library, so I would think its behavior would remain consistent regardless of the system's thread implementation. I'm not sure of the quality at that consistent level, though.

___
Zope-Dev maillist - [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )
RE: [Zope-dev] Segfault and Deadlock
I've submitted two patches to the Python patch collector.

http://sourceforge.net/tracker/index.php?func=detail&aid=949332&group_id=5470&atid=305470

is something that should probably work with any pthreads-based Unix implementation. It simply unblocks the signals that are normally delivered synchronously and that the pthreads standard says should not be blocked.

Another patch,

http://sourceforge.net/tracker/?func=detail&aid=948614&group_id=5470&atid=305470

redirects LinuxThreads asynchronous signals to Python's main thread. Right now it is done at compile time, but I think I can change this to a runtime check.

As the patches are written, I doubt they can both be applied onto a standard Python. Their purposes don't conflict, though, and they could probably both be used.
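The idea behind the first patch can be illustrated from Python itself. This is only a minimal sketch, not the patch: it assumes a Unix system with Python 3.3 or later (for `signal.pthread_sigmask`), and the set `SYNCHRONOUS_SIGNALS` and both function names are hypothetical choices made for the illustration.

```python
import signal
import threading

# Assumption: these are the signals normally raised synchronously by a fault.
SYNCHRONOUS_SIGNALS = {signal.SIGSEGV, signal.SIGBUS, signal.SIGFPE, signal.SIGILL}


def unblock_synchronous_signals():
    """Unblock the synchronous signals in the calling thread; return the new mask."""
    signal.pthread_sigmask(signal.SIG_UNBLOCK, SYNCHRONOUS_SIGNALS)
    # SIG_BLOCK with an empty set changes nothing and returns the current mask.
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())


def worker_mask_after_unblocking():
    """Start a worker with signals blocked, then unblock the synchronous ones in it."""
    result = {}
    blocked = SYNCHRONOUS_SIGNALS | {signal.SIGUSR1, signal.SIGTERM}
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, blocked)
    try:
        # The new thread inherits the creating thread's signal mask.
        worker = threading.Thread(
            target=lambda: result.update(mask=unblock_synchronous_signals())
        )
        worker.start()
        worker.join()
    finally:
        # Restore the creating thread's original mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return result["mask"]
```

After the call, the worker's mask still contains the asynchronous signals (here SIGUSR1 and SIGTERM) but no longer SIGSEGV and friends, so a fault in that thread would be delivered instead of being held back.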
RE: [Zope-dev] Segfault and Deadlock
Carl Witty [EMAIL PROTECTED] wrote on 05/04/2004 08:18:52 PM:

> I don't think it should be tested for in configure (or at compile-time at all). People will want to have binary distributions that work both with LinuxThreads and NPTL; some people actually switch back and forth on an application-by-application basis. It would be much better to check at runtime.

You do have some good points. I did implement the compile-time check,

http://sourceforge.net/tracker/index.php?func=detail&aid=948614&group_id=5470&atid=305470

but I will see whether I can rework it in a way that doesn't adversely affect other systems or NPTL systems.
RE: [Zope-dev] Segfault and Deadlock
As Andrew Langmead has already discovered, the LinuxThreads issue with SIGSEGV was reported on the Python bug tracker almost a year ago (well, reported, but not diagnosed):

    SIGSEGV causes hung threads (Linux)
    http://www.python.org/sf/756924

Looks like:

    can't CNTRL-C when running os.system in a thread
    http://www.python.org/sf/756940

is related.

python-dev'ers, do we have a release manager for 2.3.4 (I didn't see a resolution to the brouhaha at the end of March)? If so, is 2.3.4 still planned for this month?

tick-tock-tick-tock-ing-ly y'rs - tim
RE: [Zope-dev] Segfault and Deadlock
On Mon, 2004-05-03 at 15:57, [EMAIL PROTECTED] wrote:

> Tim Peters [EMAIL PROTECTED] wrote on 05/03/2004 04:41:08 PM:
>
> > [EMAIL PROTECTED]
> >
> > If someone cares enough to work up a patch, Python's patch tracker is open all night:
> >
> > http://sf.net/tracker/?atid=305470&group_id=5470
>
> I might be willing to try my hand at this, but I could use a tiny bit of guidance. (If you don't mind.)
>
> It seems that the patch should only be activated for LinuxThreads, and should be tested for in configure.

I don't think it should be tested for in configure (or at compile-time at all). People will want to have binary distributions that work both with LinuxThreads and NPTL; some people actually switch back and forth on an application-by-application basis. It would be much better to check at runtime.

Carl Witty
Re: [Zope-dev] Segfault and Deadlock
Dieter Maurer [EMAIL PROTECTED] wrote on 05/02/2004 01:28:48 PM:

> Willi Langenberger wrote at 2004-5-2 17:10 +0200:
>
> What is NPTL?

It stands for Native POSIX Thread Library. It is a new threads subsystem that is included in Linux 2.6 and that Red Hat has backported into their 2.4 kernels. It has some performance advantages and more correct POSIX behavior (especially in terms of signal handling) than the older LinuxThreads system.

> > ...
> > PS: A RedHat-9 system (kernel 2.4.20, with NPTL) shows a different behaviour. After the segfault, all threads disappeared. So maybe all is ok with NPTL, but i've not tested it yet...
>
> That is the good behaviour. Thus, we only have to learn how we can get NPTL for all Linux systems.

The choices seem to be to use a Linux 2.6 kernel, or to use a Red Hat 2.4 kernel with NPTL backported into it. (The earliest releases of Red Hat 9 had problems, but they seem to have been fixed in later kernel and glibc updates.)

The older LinuxThreads library has a non-standard threading function, pthread_kill_other_threads_np, that can be used as a workaround to notify other threads of termination.
RE: [Zope-dev] Segfault and Deadlock
Tim Peters wrote at 2004-5-2 23:16 -0400:

> ...
> > Suppose a thread dies while holding the GIL (Python's global interpreter lock). Will the GIL be released so that another thread (including the main thread) can continue?
>
> There's no general answer to that. I expect that under *most* platform threading implementations, all threads will be dead in the water then, because threads are intentionally (by the OS and C runtime) lightweight objects under most implementations, and don't save away enough info to make it *possible* for the platform thread runtime to recover gracefully in case of thread disaster.

That would not be necessary as long as all threads die.

The reason why I believe Python is to blame: With Python 2.1.3, a SIGSEGV in one thread killed them all; with Python 2.3.3, a SIGSEGV in one thread kills one of them (the main thread, not the thread that got the SIGSEGV) but leaves the others in a funny state.

This is on the same OS (Linux 2.4 kernel without NPTL). Apparently, Python's handling of SIGSEGV signals changed between 2.1.3 and 2.3.3.

In an earlier post, someone reported that Python explicitly blocks most signals in non-main threads.

I verified that in the SIGSEGV case above, all remaining threads had SIGSEGV blocked. I may try to change Python to not block SIGSEGV and see whether we get the old Python 2.1.3 behaviour back.

--
Dieter
RE: [Zope-dev] Segfault and Deadlock
Dieter Maurer [EMAIL PROTECTED] wrote on 05/03/2004 01:48:57 PM:

> The reason why I believe Python is to blame: With Python 2.1.3, a SIGSEGV in one thread killed them all; with Python 2.3.3, a SIGSEGV in one thread kills one of them (the main thread, not the thread that got the SIGSEGV) but leaves the others in a funny state.

You are right. This change:

http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Python/thread_pthread.h?r1=2.32&r2=2.33

causes new threads to be created with signals blocked. (The commit messages, and a lot of the threading code in Python, talk about "except for the main thread". I'm not sure if Python's threading abstraction has any concept of a main thread, but POSIX has none. All threads are peers.)

http://pauillac.inria.fr/~xleroy/linuxthreads/faq.html#J.3

discusses how the POSIX spec defines asynchronous signals to be sent to the process as a whole, which runs afoul of the older Linux threading model, in which threads are really cleverly disguised processes and each thread has a PID. The switch in signal handling between 2.1.3 and 2.3.3 (threads created after the initial thread start with signals blocked) explicitly triggers this LinuxThreads bug.

> This is on the same OS (Linux 2.4 kernel without NPTL). Apparently, Python's handling of SIGSEGV signals changed between 2.1.3 and 2.3.3.
>
> In an earlier post, someone reported that Python explicitly blocks most signals in non-main threads.
>
> I verified that in the SIGSEGV case above, all remaining threads had SIGSEGV blocked. I may try to change Python to not block SIGSEGV and see whether we get the old Python 2.1.3 behaviour back.
>
> --
> Dieter
RE: [Zope-dev] Segfault and Deadlock
[Dieter Maurer]
> The reason why I believe Python is to blame:

Then this should really move to a Python bug tracker.

> With Python 2.1.3, a SIGSEGV in one thread killed them all; with Python 2.3.3, a SIGSEGV in one thread kills one of them (the main thread, not the thread that got the SIGSEGV) but leaves the others in a funny state.
>
> This is on the same OS (Linux 2.4 kernel without NPTL). Apparently, Python's handling of SIGSEGV signals changed between 2.1.3 and 2.3.3.

SIGSEGV is mentioned only in Python's signalmodule.c. You can use ViewCVS to show a diff between the 2.1.3 state of that (tag r213) and current HEAD. I don't see any possibly relevant differences:

    http://cvs.sf.net/viewcvs.py/python/python/dist/src/Modules/signalmodule.c

> In an earlier post, someone reported that Python explicitly blocks most signals in non-main threads.

I'm not clear on exactly what "blocked" means. The comments at the top of signalmodule.c say:

    ...
    When threads are supported, we want the following semantics:

    - only the main thread can set a signal handler
    - any thread can get a signal handler
    - signals are only delivered to the main thread
    ...

That's the intent.

> I verified that in the SIGSEGV case above, all remaining threads had SIGSEGV blocked. I may try to change Python to not block SIGSEGV and see whether we get the old Python 2.1.3 behaviour back.

The relevant change is probably in Python/thread_pthread.h. Guido added a call to pthread_sigmask (or sigprocmask, depending on how broken the platform pthread support is ...), to PyThread__init_thread(), in revision 2.33. The checkin comment begins:

    Add SF patch #468347 -- mask signals for non-main pthreads, by Jason Lowe:

    This patch updates Python/thread_pthread.h to mask all signals for any thread created. This will keep all signals masked for any thread that isn't the initial thread. For Solaris and Linux, the two platforms I was able to test it on, it solves bug #465673 (pthreads need signal protection) and probably will solve bug #219772 (Interactive Interpreter+ Thread - core dump at exit).

That was added before 2.1.3, but looks like it didn't get backported to the 2.1.3 maintenance branch before 2.1.3 was released.
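The mechanism the checkin comment describes (mask signals around thread creation so the new thread starts with them blocked) can be sketched in Python. This is an illustration under assumptions, not the code from thread_pthread.h: it needs Unix and Python 3.3 or later, the names `MASKED`, `start_masked_thread` and `current_mask` are hypothetical, and it blocks only a small subset where the patch masks all signals.

```python
import signal
import threading

# Hypothetical subset for the illustration; the patch described above masks all signals.
MASKED = {signal.SIGSEGV, signal.SIGTERM, signal.SIGUSR1}


def current_mask():
    """Return the calling thread's signal mask without changing it."""
    return signal.pthread_sigmask(signal.SIG_BLOCK, set())


def start_masked_thread(target):
    """Start target in a thread that inherits a mask with MASKED blocked."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, MASKED)
    try:
        thread = threading.Thread(target=target)
        thread.start()  # the new thread copies the creator's current mask
    finally:
        # Restore the creating thread's mask so it keeps receiving signals.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    return thread
```

A thread started this way reports SIGSEGV as blocked in its own mask, which matches what Dieter observed in the remaining threads, while the creating thread's mask is unchanged afterwards.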
RE: [Zope-dev] Segfault and Deadlock
Tim Peters [EMAIL PROTECTED] wrote on 05/03/2004 03:47:31 PM:

> [Dieter Maurer]
>
> I'm not clear on exactly what "blocked" means.

It has a very specific meaning with Unix signals. The kernel still has the signal for the process waiting in a queue, but the process has told the kernel that it is not interested in receiving it yet. Blocking is set by the pthread_sigmask or sigprocmask functions mentioned below.

> The comments at the top of signalmodule.c say:
>
>     ...
>     When threads are supported, we want the following semantics:
>
>     - only the main thread can set a signal handler
>     - any thread can get a signal handler
>     - signals are only delivered to the main thread
>     ...
>
> That's the intent.

[stuff deleted]

For a POSIX-compatible pthread library, Python's current implementation (set all signal handlers in the initial thread, start all subsequent threads with signals blocked) will produce the intended Python threading model behavior described above.

For LinuxThreads, blocked signals in threads are exactly where it is incompatible with POSIX. Since LinuxThreads threads are (not so) cleverly disguised processes, each with their own PID, signals can be sent to a thread and, if blocked, will never get rerouted to another thread. (When the default action for a signal is to terminate, and a thread leaves that default in place, the internal thread management will notice that one thread died of a signal and will handle the rest.)

> > I verified that in the SIGSEGV case above, all remaining threads had SIGSEGV blocked. I may try to change Python to not block SIGSEGV and see whether we get the old Python 2.1.3 behaviour back.
>
> The relevant change is probably in Python/thread_pthread.h. Guido added a call to pthread_sigmask (or sigprocmask, depending on how broken the platform pthread support is ...),

In order to get LinuxThreads to support Python's threading semantics, what probably needs to be done is to have PyThread_init_thread set all handlers to call kill(main_thread, sig) to signal the main thread.
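What "blocked" means can be observed directly from Python. The following is a small sketch, assuming Unix and Python 3.3 or later, run from the main thread (Python only allows handlers to be installed there); the function name `demonstrate_blocked_signal` is made up for the example.

```python
import signal
import threading


def demonstrate_blocked_signal(signum=signal.SIGUSR1):
    """Return (was_pending_while_blocked, delivered_after_unblock)."""
    delivered = []
    old_handler = signal.signal(signum, lambda sig, frame: delivered.append(sig))
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        # Direct the signal at this thread; it is blocked, so it stays queued.
        signal.pthread_kill(threading.get_ident(), signum)
        was_pending = signum in signal.sigpending() and not delivered
        # Unblocking lets the kernel deliver the queued signal.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signum})
        return was_pending, signum in delivered
    finally:
        # Put the original mask and handler back.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
        signal.signal(signum, old_handler)
```

While the signal is blocked it shows up in `signal.sigpending()` and the handler has not run; once it is unblocked the handler fires. Under LinuxThreads, the trouble described above is that a signal pending on one blocked thread is never handed to a different thread.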
RE: [Zope-dev] Segfault and Deadlock
[EMAIL PROTECTED]
> [... snip good explanations ...]
>
> In order to get LinuxThreads to support Python's threading semantics, what probably needs to be done is to have PyThread_init_thread set all handlers to call kill(main_thread, sig) to signal the main thread.

If someone cares enough to work up a patch, Python's patch tracker is open all night:

    http://sf.net/tracker/?atid=305470&group_id=5470
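The forwarding idea quoted above, re-sending a signal to the main thread, has a rough Python analogue. This sketch is not the proposed C change: it assumes Unix and Python 3.4 or later, uses `signal.pthread_kill` where the proposal uses kill() on the main thread's PID (LinuxThreads gave each thread its own PID), and the function names are hypothetical. Note also that current Python always runs Python-level handlers in the main thread, so the sketch only shows the thread-directed send.

```python
import signal
import threading
import time


def forward_to_main_thread(signum):
    """Send signum specifically to the main thread (thread-directed)."""
    signal.pthread_kill(threading.main_thread().ident, signum)


def worker_forwards_signal(signum=signal.SIGUSR1, timeout=2.0):
    """Have a worker forward signum; return the names of threads that handled it."""
    handled_in = []
    old_handler = signal.signal(
        signum, lambda sig, frame: handled_in.append(threading.current_thread().name)
    )
    try:
        worker = threading.Thread(target=forward_to_main_thread, args=(signum,))
        worker.start()
        worker.join()
        deadline = time.monotonic() + timeout
        while not handled_in and time.monotonic() < deadline:
            time.sleep(0.01)  # give the interpreter a chance to run the handler
    finally:
        signal.signal(signum, old_handler)
    return handled_in
```

Calling `worker_forwards_signal()` from the main thread returns a list naming the main thread as the one that handled the signal.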
RE: [Zope-dev] Segfault and Deadlock
Tim Peters [EMAIL PROTECTED] wrote on 05/03/2004 04:41:08 PM:

> [EMAIL PROTECTED]
>
> If someone cares enough to work up a patch, Python's patch tracker is open all night:
>
> http://sf.net/tracker/?atid=305470&group_id=5470

I might be willing to try my hand at this, but I could use a tiny bit of guidance. (If you don't mind.)

It seems that the patch should only be activated for LinuxThreads, and should be tested for in configure. Is it reasonable to test for a LinuxThreads-specific function (like pthread_kill_other_threads_np)? Should I create a functional test that tries to cause the LinuxThreads-specific behavior (cause a deadlock) and then notice the problem and fix it? Should I use the glibc feature getconf GNU_LIBPTHREAD_VERSION?

The first is easiest to test for, but seems a little error prone. (What if someone else adds the non-standard function in order to ease porting from Linux? What if someone comes up with a LinuxThreads update that solves this problem?) It's testing a feature that is related to the feature I want info for, but not the troublesome behavior itself.

The second solution seems to be one step away from the halting problem (although it might be able to be done with: block signal_a, send signal_a, send signal_b; if signal_b is caught but not signal_a, then signals are not rerouted across threads).

The third option seems to be somewhere between the two. (If getconf exists and the symbol doesn't, then we have older LinuxThreads. If getconf exists and the symbol returns linuxthreads, then we have newer LinuxThreads. Otherwise assume a compliant pthread.)

Is it reasonable to put a LinuxThreads-specific replacement SET_THREAD_SIGMASK in thread_pthread.h? There are already a slew of system-specific defines, and the differences don't seem extreme enough to make a separate thread_linuxthreads.h.

This has, of course, long veered off from being about Zope development, so anyone wishing to contact me off list, feel free.
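The third option (asking glibc which pthread library is in use) can also be done at run time rather than in configure. Below is a hedged sketch in Python, assuming a Unix platform; `os.confstr` exposes the same value as `getconf GNU_LIBPTHREAD_VERSION` on glibc systems, and the two function names are invented for the example. Unlike the rule in the message above, a missing symbol is classified here as "unknown" rather than as older LinuxThreads.

```python
import os


def pthread_library_version():
    """Return glibc's pthread library string, or None when it is not reported."""
    name = "CS_GNU_LIBPTHREAD_VERSION"
    if name not in os.confstr_names:
        return None
    try:
        return os.confstr(name)
    except (OSError, ValueError):
        return None


def classify_thread_library(version):
    """Map a version string to 'linuxthreads', 'nptl', or 'unknown'."""
    if not version:
        return "unknown"
    lowered = version.lower()
    if lowered.startswith("linuxthreads"):
        return "linuxthreads"
    if lowered.startswith("nptl"):
        return "nptl"
    return "unknown"
```

For example, `classify_thread_library(pthread_library_version())` could gate a LinuxThreads-specific workaround at startup, which is the kind of runtime check a binary distribution would need.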
RE: [Zope-dev] Segfault and Deadlock
[[EMAIL PROTECTED], on special-casing LinuxThreads]
> I might be willing to try my hand at this, but I could use a tiny bit of guidance. (If you don't mind.)

I don't mind <wink>, but I haven't run on Linux since 1994, and have lost track of how Unixish special-casing is done in Python since then. Best advice is to start with a bug report on Python's bug tracker, and perhaps a msg to mailto:[EMAIL PROTECTED]. I think Martin v. Löwis is currently most knowledgeable about messy config issues in Python.

> It seems that the patch should only be activated for LinuxThreads, and should be tested for in configure.

Sounds plausible, but I wouldn't know.

> Is it reasonable to test for a LinuxThreads-specific function (like pthread_kill_other_threads_np)? Should I create a functional test that tries to cause the LinuxThreads-specific behavior (cause a deadlock) and then notice the problem and fix it? Should I use the glibc feature getconf GNU_LIBPTHREAD_VERSION?

I don't know what's available in LinuxThreads *to* test. Most packages have some God-awful preprocessor #define to key off of. Also don't know whether the specific breakage at issue here is unique to LinuxThreads.

> The first is easiest to test for, but seems a little error prone. (What if someone else adds the non-standard function in order to ease porting from Linux? What if someone comes up with a LinuxThreads update that solves this problem?) It's testing a feature that is related to the feature I want info for, but not the troublesome behavior itself.

I expect that's why most people settle for testing a package-specific #define. It's also why there's always at least some resistance to patches that do key off goofy symbols: the #ifdef'ed code will probably remain there forever, regardless of whether the problem still exists. So:

> The second solution seems to be one step away from the halting problem (although it might be able to be done with: block signal_a, send signal_a, send signal_b; if signal_b is caught but not signal_a, then signals are not rerouted across threads).

An autoconf-able test that checks for the actual bad behavior would be best.

> ...
> Is it reasonable to put a LinuxThreads-specific replacement SET_THREAD_SIGMASK in thread_pthread.h?

Yes.

> There are already a slew of system-specific defines, and the differences don't seem extreme enough to make a separate thread_linuxthreads.h.

Fully agreed. LinuxThreads is primarily pthreads with a bug. That makes it qualitatively the same as all other pthreads implementations <wink>.
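A behavioural probe along the lines discussed above can be approximated in Python. This is a simplified sketch rather than the autoconf test being asked for: it uses a single signal instead of the signal_a/signal_b pair, assumes Unix and Python 3.3 or later, must be called from the main thread, and the name `signals_rerouted_across_threads` is hypothetical.

```python
import os
import signal
import threading
import time


def signals_rerouted_across_threads(signum=signal.SIGUSR1, timeout=2.0):
    """Return True if a signal blocked in the sending thread still gets handled."""
    caught = []
    old_handler = signal.signal(signum, lambda sig, frame: caught.append(sig))

    def worker():
        # Block the signal in this thread, then send it to the whole process.
        signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
        os.kill(os.getpid(), signum)

    try:
        thread = threading.Thread(target=worker)
        thread.start()
        thread.join()
        deadline = time.monotonic() + timeout
        while not caught and time.monotonic() < deadline:
            time.sleep(0.01)  # let the main thread notice the signal
    finally:
        signal.signal(signum, old_handler)
    return bool(caught)
```

With a POSIX-compliant thread library the process-directed signal is delivered to a thread that has not blocked it, so the handler runs and the probe reports True; the thread above suggests the signal would stay stuck on the blocking thread under LinuxThreads.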
Re: [Zope-dev] Segfault and Deadlock
Willi Langenberger wrote at 2004-5-2 17:10 +0200:
> ...
> The reason is the way python handles threads on some systems (RedHat-7.3, kernel 2.4.20, without NPTL).

What is NPTL?

> ...
> PS: A RedHat-9 system (kernel 2.4.20, with NPTL) shows a different behaviour. After the segfault, all threads disappeared. So maybe all is ok with NPTL, but i've not tested it yet...

That is the good behaviour. Thus, we only have to learn how we can get NPTL for all Linux systems.

By the way, nobody answered my problem report on comp.lang.python. Was maybe a bad time, during Pycon.

--
Dieter
Re: [Zope-dev] Segfault and Deadlock
According to Dieter Maurer:
> > The reason is the way python handles threads on some systems (RedHat-7.3, kernel 2.4.20, without NPTL).
>
> What is NPTL?

Native POSIX Thread Library.

> That is the good behaviour. Thus, we only have to learn how we can get NPTL for all Linux systems.

However, I don't know enough about NPTL. Only that it caused us some grief when we migrated applications from RedHat-7.3 to RedHat-9 (we had to set LD_ASSUME_KERNEL=2.4.1 for some applications [including Oracle] to work).

> By the way, nobody answered my problem report on comp.lang.python. Was maybe a bad time, during Pycon.

Yes, I think it is more a Python problem than a Zope problem. But it bites the Zope server on a Linux system w/o NPTL. Maybe we have more luck this time...

\wlang{}

--
[EMAIL PROTECTED] Fax: +43/1/31336/9207
Zentrum fuer Informatikdienste, Wirtschaftsuniversitaet Wien, Austria
Re: [Zope-dev] Segfault and Deadlock
On 2 May 2004 at 13:28, Dieter Maurer wrote:

> Willi Langenberger wrote at 2004-5-2 17:10 +0200:
> > ...
> > The reason is the way python handles threads on some systems (RedHat-7.3, kernel 2.4.20, without NPTL).
>
> What is NPTL?

The native POSIX thread library or something like that. It's a new threading implementation that was introduced with an update to RedHat 9. Fedora Core has it by default, as does RH Enterprise Server 3, I believe.

jens
RE: [Zope-dev] Segfault and Deadlock
[EMAIL PROTECTED]
> Hi Zope (and Python) experts!
>
> There seems to be a problem when an external python module segfaults during a zope request. The remaining worker threads are deadlocked.

Maybe, maybe not. Python (and so also Zope) uses platform-native thread facilities, and what happens when SIGSEGV gets signaled is mostly up to them. That's why you see different behavior, e.g., between Linux with NPTL and Linux without NPTL: the OS and C runtime determine exceptional thread semantics, and Python isn't the operating system.

> Suppose a thread dies while holding the GIL (Python's global interpreter lock). Will the GIL be released so that another thread (including the main thread) can continue?

There's no general answer to that. I expect that under *most* platform threading implementations, all threads will be dead in the water then, because threads are intentionally (by the OS and C runtime) lightweight objects under most implementations, and don't save away enough info to make it *possible* for the platform thread runtime to recover gracefully in case of thread disaster. The natural (least effort) behavior is for the system to kill off the thread, simply ignoring whatever resources it may be holding. In that case, all Python threads remaining will hang forever waiting to acquire the GIL.

I expect the best that can be done, short of heroic effort (like writing your own platform thread implementation), is to document what the various thread implementations actually do.

> ...
> The reason is the way python handles threads on some systems (RedHat-7.3, kernel 2.4.20, without NPTL).

If you search the Python implementation, you'll find that there's nothing different in what Python does depending on whether NPTL is present. On any system, all Python asks of the platform thread gimmicks is (a) a way to start a thread, and (b) a way to implement Python lock semantics. On any POSIX system, #b is done with POSIX semaphores

    #if defined(_POSIX_SEMAPHORES) && !defined(HAVE_BROKEN_POSIX_SEMAPHORES)

else #b is done with a combination of POSIX mutexes and POSIX condition variables. It could be that whether POSIX semaphores are available on Linux depends on whether NPTL is in place -- I don't know. But if so, that may be the relevant difference.
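The "hang forever waiting to acquire the GIL" scenario can be imitated with an ordinary lock. This is only an analogy: the GIL is not a `threading.Lock`, and the function name `lock_survives_holder_exit` is made up for the example. The sketch shows that a lock taken by a thread that then goes away is not released on its behalf.

```python
import threading


def lock_survives_holder_exit(wait_seconds=0.2):
    """Return True if a lock stays held after the thread that took it has exited."""
    lock = threading.Lock()

    def holder():
        lock.acquire()  # deliberately never released, like a thread dying mid-work

    thread = threading.Thread(target=holder)
    thread.start()
    thread.join()  # the holder is gone now

    # Anyone else trying to take the lock would wait; use a timeout to avoid hanging.
    acquired = lock.acquire(timeout=wait_seconds)
    if acquired:
        lock.release()
    return not acquired
```

The remaining threads in the report are in the same position as the final `acquire` here, except that they wait without a timeout.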