Re: Strange lockup with metacity
Chong Yidong wrote:
>>> There was one controversial patch applied to src/xterm.c at the beginning of November, which may have some bearing on this. This is just a hunch, but could you apply the reversed patch and see if the problem persists?
>> Applying that patch fixed the problem. I'm much happier now - thanks!
> My preceding email to Jon was not sent to this list, but the patch in question was the _NET_ACTIVE_WINDOW hack discussed in the "raise-frame doesn't work in Fedora Core 4" thread on emacs-devel. Apparently, the hack has bad side-effects.

It is not a hack, it is following a Freedesktop specification. We have to revisit the whole specification after the release and probably add a lot more of these _NET_* settings.

I'd say it is a bug in metacity (there are plenty already ...), but I've changed that bit in Emacs so we only send _NET_ACTIVE_WINDOW on explicit raise-frame calls.

Jonathan, can you test a newer CVS? Can you also check if raise-frame works on your old copy (i.e. the one that didn't hang metacity) of Emacs and the newer one?

Thanks,

Jan D.

___
emacs-pretest-bug mailing list
emacs-pretest-bug@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug
Re: Strange lockup with metacity
Jan Djärv [EMAIL PROTECTED] wrote:
> I'd say it is a bug in metacity (there are plenty already ...), but I've changed that bit in Emacs so we only send _NET_ACTIVE_WINDOW on explicit raise-frame calls.

I have no trouble believing that it could be a metacity bug - but nothing else triggers it.

> Jonathan, can you test a newer CVS? Can you also check if raise-frame works on your old copy (i.e. the one that didn't hang metacity) of Emacs and the newer one?

Nope. The behavior has changed, though. So, to summarize:

- With 22.0.91: an attempt to resize the window will lock up every time.
- 22.0.91 with the patch sent by Chong Yidong backed out: never locks up.
- With CVS grabbed in the morning (US/Mountain) of November 30: locks up maybe one time in five - but still definitely locks up. It is, however, easier to unwedge: for whatever reason, switching to another virtual console and back shakes things loose.

Hope that helps,

jon

Jonathan Corbet / LWN.net / [EMAIL PROTECTED]
Re: Strange lockup with metacity
> I'm seeing a strange problem with 22.0.91.1 on an x86-64 Fedora rawhide system (22.0.90 had it too). Almost everything works great, but any attempt to resize an emacs frame using the window manager locks things up. Essentially, metacity grabs the mouse then stops, waiting for something; the only way to get my desktop back is to restart metacity from somewhere else.

Can you compile metacity with debugging symbols? You could run it under GDB, perhaps on another machine's console so you can still type at GDB even when it is hung.
Re: Strange lockup with metacity
> My preceding email to Jon was not sent to this list, but the patch in question was the _NET_ACTIVE_WINDOW hack discussed in the "raise-frame doesn't work in Fedora Core 4" thread on emacs-devel. Apparently, the hack has bad side-effects.

If this causes serious problems on some systems, we have to take it out. But please don't delete the code. Please put #if 0 around it, and add another comment explaining the particulars of this problem. That might help us some time in the future figure out a better solution.
Re: Strange lockup with metacity
Jonathan Corbet wrote:
> Jan Djärv [EMAIL PROTECTED] wrote:
>> I'd say it is a bug in metacity (there are plenty already ...), but I've changed that bit in Emacs so we only send _NET_ACTIVE_WINDOW on explicit raise-frame calls.
> I have no trouble believing that it could be a metacity bug - but nothing else triggers it.
>> Jonathan, can you test a newer CVS? Can you also check if raise-frame works on your old copy (i.e. the one that didn't hang metacity) of Emacs and the newer one?
> Nope. The behavior has changed, though. So, to summarize:
>
> - With 22.0.91: an attempt to resize the window will lock up every time.
> - 22.0.91 with the patch sent by Chong Yidong backed out: never locks up.
> - With CVS grabbed in the morning (US/Mountain) of November 30: locks up maybe one time in five - but still definitely locks up. It is, however, easier to unwedge: for whatever reason, switching to another virtual console and back shakes things loose.
>
> Hope that helps,

Can you find out what version of metacity you have?

If we can't fix this, we have to take out that code as Richard says. Too many people are using metacity. Funny, choosing between metacity bugs. At least the raise-frame bug doesn't cause a hang.

Thanks,

Jan D.
Strange lockup with metacity
I'm seeing a strange problem with 22.0.91.1 on an x86-64 Fedora rawhide system (22.0.90 had it too). Almost everything works great, but any attempt to resize an emacs frame using the window manager locks things up. Essentially, metacity grabs the mouse then stops, waiting for something; the only way to get my desktop back is to restart metacity from somewhere else.

This does not happen with any other application; it also does not happen with emacs 21. Clearly, emacs 22 pretest is doing something differently, and it's creating weirdness.

I'm not sure how to try to debug this, but thought I would toss it out there. If there's further information I could get to help track it down, let me know and I'll do my best.

Thanks,

jon
Re: Strange lockup with metacity
Jonathan Corbet [EMAIL PROTECTED] writes:
> I'm seeing a strange problem with 22.0.91.1 on an x86-64 Fedora rawhide system (22.0.90 had it too). Almost everything works great, but any attempt to resize an emacs frame using the window manager locks things up. Essentially, metacity grabs the mouse then stops, waiting for something; the only way to get my desktop back is to restart metacity from somewhere else.
>
> This does not happen with any other application; it also does not happen with emacs 21. Clearly, emacs 22 pretest is doing something differently, and it's creating weirdness.
>
> I'm not sure how to try to debug this, but thought I would toss it out there. If there's further information I could get to help track it down, let me know and I'll do my best.

It does not happen for me on Ubuntu Dapper (Metacity 2.14.5).

We need more information: does it happen with `emacs -Q', and when Emacs is compiled with/without GTK support? Please provide the information given using M-x report-emacs-bug RET.
Re: Strange lockup with metacity
Chong Yidong [EMAIL PROTECTED] wrote:
> There was one controversial patch applied to src/xterm.c at the beginning of November, which may have some bearing on this. This is just a hunch, but could you apply the reversed patch and see if the problem persists?

Applying that patch fixed the problem. I'm much happier now - thanks!

jon

Jonathan Corbet / LWN.net / [EMAIL PROTECTED]
Re: Lockup
YAMAMOTO Mitsuharu wrote:
> On Thu, 10 Aug 2006 13:17:03 +0200, Jan Djärv [EMAIL PROTECTED] said:
>>> My intention was that the above scenario would be avoided with BLOCK_INPUT around functions that may call malloc-related functions.
>> It does not help if the calling thread is one of the Gnome threads.
> But a signal delivered to a non-main thread is redirected to the main thread by SIGNAL_THREAD_CHECK.

A signal yes, but I was thinking of this scenario: A Gnome thread does malloc, gets the mutex lock and enters the malloc code. A signal is delivered (in the main thread as you point out) and enters malloc also. This situation is exactly like the one with the lockup, but here we can't use BLOCK_INPUT around the malloc-related functions because they are in the Gnome code.

>>> How about just changing the order of lock/unlock and BLOCK_INPUT/UNBLOCK_INPUT in the previous version of BLOCK_INPUT_ALLOC/UNBLOCK_INPUT_ALLOC?
>> That would mean that lock/unlock mutex functions are called in the signal handler context, which is not allowed according to the documentation.
> Yes, pthread_mutex_(un)lock is not async-signal-safe. But we are already using such functions as malloc in the signal handler context (with the help of BLOCK_INPUT). I guess calling pthread_mutex_(un)lock in the signal handler context is safe in reality unless the interrupted thread is also executing pthread_mutex_(un)lock for the same mutex. I think it's better than the current one, i.e., not protecting shared resources such as __malloc_hook in the signal handler context.

I agree with your assumption that the lockup occurs because the signal handler and the interrupted thread are calling pthread_mutex_(un)lock for the same mutex. But BLOCK_INPUT does not help, because Gnome code does not have it. So I tried to do the next best thing, i.e. block SIGIO in non-main threads. The problem with this is that I can't block SIGIO before taking the mutex, because if I hang when taking the mutex, SIGIO would remain blocked. One could use trylock and some sort of busy loop, but I don't think that is usable.

> (Of course SYNC_INPUT is the right direction, but the current plan is not enabling it in the next release as far as I understand.)

Unless someone comes up with a supersafe scheme I think we have to live with this race condition until then. But it is better now than before: SIGIO and the main thread executing the same (un)lock should not lock up. But if the signal handler is executing in one thread on one processor and a Gnome thread is executing on another processor, there could be a lockup.

Jan D.
Re: Lockup
YAMAMOTO Mitsuharu [EMAIL PROTECTED] writes:
> Yes, pthread_mutex_(un)lock is not async-signal-safe. But we are already using such functions as malloc in the signal handler context (with the help of BLOCK_INPUT). I guess calling pthread_mutex_(un)lock in the signal handler context is safe in reality unless the interrupted thread is also executing pthread_mutex_(un)lock for the same mutex.

"I guess ... safe in reality unless ..."

Maybe I have been around programmers too long, but I don't find this exactly reassuring.

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
Re: Lockup
On Fri, 11 Aug 2006 08:36:39 +0200, Jan Djärv [EMAIL PROTECTED] said:
> A signal yes, but I was thinking of this scenario: A Gnome thread does malloc, gets the mutex lock and enters the malloc code. A signal is delivered (in the main thread as you point out) and enters malloc also. This situation is exactly like the one with the lockup, but here we can't use BLOCK_INPUT around the malloc-related functions because they are in the Gnome code.

I think such a case just behaves like a usual mutual exclusion between multiple threads: one thread acquires a mutex, and the other blocks until it is released.

> I agree with your assumption that the lockup occurs because the signal handler and the interrupted thread are calling pthread_mutex_(un)lock for the same mutex. But BLOCK_INPUT does not help, because Gnome code does not have it.

That's not a problem because Gnome threads (non-main threads) never execute pthread_mutex_(un)lock in the signal handler context.

YAMAMOTO Mitsuharu
[EMAIL PROTECTED]
Re: Lockup
On Fri, 11 Aug 2006 09:04:34 +0200, David Kastrup [EMAIL PROTECTED] said:
> YAMAMOTO Mitsuharu [EMAIL PROTECTED] writes:
>> Yes, pthread_mutex_(un)lock is not async-signal-safe. But we are already using such functions as malloc in the signal handler context (with the help of BLOCK_INPUT). I guess calling pthread_mutex_(un)lock in the signal handler context is safe in reality unless the interrupted thread is also executing pthread_mutex_(un)lock for the same mutex.
> "I guess ... safe in reality unless ..."
>
> Maybe I have been around programmers too long, but I don't find this exactly reassuring.

Yeah. IEEE Std 1003.1 provides a table of async-signal-safe functions (those that can be called safely within a signal handler), and neither malloc nor pthread_mutex_(un)lock is such a function.

    All functions not in the above table are considered to be unsafe with respect to signals. In the presence of signals, all functions defined by this volume of IEEE Std 1003.1-2001 shall behave as defined when called from or interrupted by a signal-catching function, with a single exception: when a signal interrupts an unsafe function and the signal-catching function calls an unsafe function, the behavior is undefined.

    (http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html)

We already have malloc calls in the signal handler context with the assumption that it is safe to call unless the signal interrupts malloc-related functions. So I think it's not that bad to also apply reasonable assumptions to pthread_mutex_(un)lock.

YAMAMOTO Mitsuharu
[EMAIL PROTECTED]
Re: Lockup
On Fri, 11 Aug 2006 10:09:49 +0200, Jan Djärv [EMAIL PROTECTED] said:
>> That's not a problem because Gnome threads (non-main threads) never execute pthread_mutex_(un)lock in the signal handler context.
> That does not help, the main thread executes in signal handler context sometimes. And in that case, both the Gnome thread and the signal handler may be executing (un)lock_mutex on the same mutex.

I don't think this causes a problem. The signal handler is executed in the main thread, which is different from the Gnome thread. And as I quoted from IEEE Std 1003.1 in another message, a pthread_mutex_(un)lock call in the signal handler context should work as usual unless the signal interrupted an unsafe function.

The condition "unless the signal interrupted an unsafe function" is too strict in reality. My guess was that it could be relaxed to "unless the signal interrupted a pthread_mutex_(un)lock call for the same mutex".

YAMAMOTO Mitsuharu
[EMAIL PROTECTED]
Re: Lockup
David Kastrup wrote:
> Hi, I just had a lockup occurring. Here is a backtrace:

I've checked in a fix, but I believe the race condition still exists on multiprocessor machines. I can't see a way to fix that except move to SYNC_INPUT.

Jan D.
Re: Lockup
On Thu, 10 Aug 2006 08:20:24 +0200, Jan Djärv [EMAIL PROTECTED] said:
>> Hi, I just had a lockup occurring. Here is a backtrace:
> I've checked in a fix, but I believe the race condition still exists on multiprocessor machines. I can't see a way to fix that except move to SYNC_INPUT.

Maybe I'm missing something, but doesn't adding BLOCK_INPUT around closedir (and opendir) help?

YAMAMOTO Mitsuharu
[EMAIL PROTECTED]

>> #22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac) at /home/tmp/emacs/src/xterm.c:7067
>> #23 0x080f90dc in read_avail_input (expected=1) at /home/tmp/emacs/src/keyboard.c:6737
>> #24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
>> #25 0x080f9317 in input_available_signal (signo=29) at /home/tmp/emacs/src/keyboard.c:6925
>> #26 signal handler called
>> #27 0xb795d27d in pthread_mutex_unlock () from /lib/tls/i686/cmov/libpthread.so.0
>> #28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
>> #29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
>> #30 0x08122c57 in directory_files_internal_unwind (dh=161026626) at /home/tmp/emacs/src/dired.c:137
Re: Lockup
YAMAMOTO Mitsuharu wrote:
> On Thu, 10 Aug 2006 08:20:24 +0200, Jan Djärv [EMAIL PROTECTED] said:
>>> Hi, I just had a lockup occurring. Here is a backtrace:
>> I've checked in a fix, but I believe the race condition still exists on multiprocessor machines. I can't see a way to fix that except move to SYNC_INPUT.
> Maybe I'm missing something, but doesn't adding BLOCK_INPUT around closedir (and opendir) help?

In this particular case it would help, but in general the problem is that the signal handler gets called when the main thread is executing in the mutex code (pthread_mutex_unlock). Later, when the signal handler tries to get the mutex, it locks up. It is actually not allowed to call mutex functions from a signal handler.

The mutex is there to protect from other (Gnome) threads that also call malloc/free. But if a Gnome thread has the mutex and, before it has blocked signals, the signal handler is run in parallel on another processor, there will be problems.

If we move to SYNC_INPUT there will be no malloc/free called in the signal handler and we only need the mutex to handle concurrent access.

Jan D.

>>> #22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac) at /home/tmp/emacs/src/xterm.c:7067
>>> #23 0x080f90dc in read_avail_input (expected=1) at /home/tmp/emacs/src/keyboard.c:6737
>>> #24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
>>> #25 0x080f9317 in input_available_signal (signo=29) at /home/tmp/emacs/src/keyboard.c:6925
>>> #26 signal handler called
>>> #27 0xb795d27d in pthread_mutex_unlock () from /lib/tls/i686/cmov/libpthread.so.0
>>> #28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
>>> #29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
>>> #30 0x08122c57 in directory_files_internal_unwind (dh=161026626) at /home/tmp/emacs/src/dired.c:137
Re: Lockup
On Thu, 10 Aug 2006 10:11:55 +0200, Jan Djärv [EMAIL PROTECTED] said:
>> Maybe I'm missing something, but doesn't adding BLOCK_INPUT around closedir (and opendir) help?
> In this particular case it would help, but in general the problem is that the signal handler gets called when the main thread is executing in the mutex code (pthread_mutex_unlock). Later when the signal handler tries to get the mutex, it locks up,

My intention was that the above scenario would be avoided with BLOCK_INPUT around functions that may call malloc-related functions.

> it is actually not allowed to call mutex functions from a signal handler. The mutex is there to protect from other (Gnome) threads that also call malloc/free. But if a Gnome thread has the mutex and, before it has blocked signals, the signal handler is run in parallel on another processor, there will be problems. If we move to SYNC_INPUT there will be no malloc/free called in the signal handler and we only need the mutex to handle concurrent access.

The current version would cause such a problem because now BLOCK_INPUT_ALLOC and UNBLOCK_INPUT_ALLOC are no-ops in a signal handler. How about just changing the order of lock/unlock and BLOCK_INPUT/UNBLOCK_INPUT in the previous version of BLOCK_INPUT_ALLOC/UNBLOCK_INPUT_ALLOC?

#define BLOCK_INPUT_ALLOC                       \
  do                                            \
    {                                           \
      if (pthread_self () == main_thread)       \
        BLOCK_INPUT;                            \
      pthread_mutex_lock (&alloc_mutex);        \
    }                                           \
  while (0)
#define UNBLOCK_INPUT_ALLOC                     \
  do                                            \
    {                                           \
      pthread_mutex_unlock (&alloc_mutex);      \
      if (pthread_self () == main_thread)       \
        UNBLOCK_INPUT;                          \
    }                                           \
  while (0)

YAMAMOTO Mitsuharu
[EMAIL PROTECTED]
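To make the proposed ordering concrete, here is a self-contained sketch that compiles outside Emacs. BLOCK_INPUT and UNBLOCK_INPUT are replaced by a plain counter, which is an assumption made for illustration and not the real definitions. input_blocked_depth and guarded_section_depth are invented names, and pthread_equal is used in place of == because pthread_t need not be a scalar type.

```c
#include <pthread.h>

/* Stand-ins (assumptions): a counter takes the place of Emacs's input blocking. */
static volatile int input_blocked_depth = 0;
#define BLOCK_INPUT   (input_blocked_depth++)
#define UNBLOCK_INPUT (input_blocked_depth--)

static pthread_t main_thread;
static pthread_mutex_t alloc_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Proposed order: block input first, then take the mutex, so the main
   thread cannot be interrupted by the input signal while it holds the lock. */
#define BLOCK_INPUT_ALLOC                                   \
  do                                                        \
    {                                                       \
      if (pthread_equal (pthread_self (), main_thread))     \
        BLOCK_INPUT;                                        \
      pthread_mutex_lock (&alloc_mutex);                    \
    }                                                       \
  while (0)

/* Mirror image: release the mutex before input is unblocked. */
#define UNBLOCK_INPUT_ALLOC                                 \
  do                                                        \
    {                                                       \
      pthread_mutex_unlock (&alloc_mutex);                  \
      if (pthread_equal (pthread_self (), main_thread))     \
        UNBLOCK_INPUT;                                      \
    }                                                       \
  while (0)

/* Enter and leave one guarded section on the main thread and report
   the blocking depth seen inside it. */
static int guarded_section_depth(void)
{
    int depth;

    main_thread = pthread_self ();
    BLOCK_INPUT_ALLOC;
    depth = input_blocked_depth;   /* 1 while inside the section */
    UNBLOCK_INPUT_ALLOC;
    return depth;
}
```

This only demonstrates the ordering on the main thread. It does not address the objection raised in the thread that the mutex calls would then also run in signal handler context.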
Re: Lockup
YAMAMOTO Mitsuharu wrote:
> On Thu, 10 Aug 2006 10:11:55 +0200, Jan Djärv [EMAIL PROTECTED] said:
>> In this particular case it would help, but in general the problem is that the signal handler gets called when the main thread is executing in the mutex code (pthread_mutex_unlock). Later when the signal handler tries to get the mutex, it locks up,
> My intention was that the above scenario would be avoided with BLOCK_INPUT around functions that may call malloc-related functions.

It does not help if the calling thread is one of the Gnome threads.

> How about just changing the order of lock/unlock and BLOCK_INPUT/UNBLOCK_INPUT in the previous version of BLOCK_INPUT_ALLOC/UNBLOCK_INPUT_ALLOC?
>
> #define BLOCK_INPUT_ALLOC                       \
>   do                                            \
>     {                                           \
>       if (pthread_self () == main_thread)       \
>         BLOCK_INPUT;                            \
>       pthread_mutex_lock (&alloc_mutex);        \
>     }                                           \
>   while (0)
> #define UNBLOCK_INPUT_ALLOC                     \
>   do                                            \
>     {                                           \
>       pthread_mutex_unlock (&alloc_mutex);      \
>       if (pthread_self () == main_thread)       \
>         UNBLOCK_INPUT;                          \
>     }                                           \
>   while (0)

That would mean that lock/unlock mutex functions are called in the signal handler context, which is not allowed according to the documentation.

Jan D.
Re: Lockup
On Thu, 10 Aug 2006 13:17:03 +0200, Jan Djärv [EMAIL PROTECTED] said:
>> My intention was that the above scenario would be avoided with BLOCK_INPUT around functions that may call malloc-related functions.
> It does not help if the calling thread is one of the Gnome threads.

But a signal delivered to a non-main thread is redirected to the main thread by SIGNAL_THREAD_CHECK.

>> How about just changing the order of lock/unlock and BLOCK_INPUT/UNBLOCK_INPUT in the previous version of BLOCK_INPUT_ALLOC/UNBLOCK_INPUT_ALLOC?
> That would mean that lock/unlock mutex functions are called in the signal handler context, which is not allowed according to the documentation.

Yes, pthread_mutex_(un)lock is not async-signal-safe. But we are already using such functions as malloc in the signal handler context (with the help of BLOCK_INPUT). I guess calling pthread_mutex_(un)lock in the signal handler context is safe in reality unless the interrupted thread is also executing pthread_mutex_(un)lock for the same mutex. I think it's better than the current one, i.e., not protecting shared resources such as __malloc_hook in the signal handler context.

(Of course SYNC_INPUT is the right direction, but the current plan is not to enable it in the next release as far as I understand.)

YAMAMOTO Mitsuharu
[EMAIL PROTECTED]
Lockup
Please write in English if possible, because the Emacs maintainers usually do not have translators to read other languages for them.

Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list.

Please describe exactly what actions triggered the bug and the precise symptoms of the bug:

Hi, I just had a lockup occurring. Here is a backtrace:

If emacs crashed, and you have the emacs process in the gdb debugger, please include the output from the following gdb commands: `bt full' and `xbacktrace'. If you would like to further debug the crash, please read the file /usr/local/emacs-21/share/emacs/22.0.50/etc/DEBUG for instructions.

(gdb) bt
#0 0xe410 in __kernel_vsyscall ()
#1 0xb79602ae in __lll_mutex_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb795cfc7 in _L_mutex_lock_159 () from /lib/tls/i686/cmov/libpthread.so.0
#3 0x0063 in ?? ()
#4 0xbfabe690 in ?? ()
#5 0x086a6d48 in ?? ()
#6 0xbfabe6d8 in ?? ()
#7 0x080c78f0 in handle_one_xevent (dpyinfo=0x82f375c, eventp=0xbfabdbec, finish=0xb79602ae, hold_quit=0x82f375c) at /home/tmp/emacs/src/xterm.c:6916
#8 0x0813b963 in emacs_blocked_malloc (size=4294967292, ptr=0xb7993042) at /home/tmp/emacs/src/alloc.c:1231
#9 0xb77a83c5 in malloc () from /lib/tls/i686/cmov/libc.so.6
#10 0xb7993042 in g_malloc () from /usr/lib/libglib-2.0.so.0
#11 0xb79a2e27 in g_strndup () from /usr/lib/libglib-2.0.so.0
#12 0xb7977fa2 in g_convert_with_fallback () from /usr/lib/libglib-2.0.so.0
#13 0xb79780f5 in g_locale_from_utf8 () from /usr/lib/libglib-2.0.so.0
#14 0xb7c84b5e in gdk_add_client_message_filter () from /usr/lib/libgdk-x11-2.0.so.0
#15 0xb7c855b2 in gdk_x11_register_standard_event_type () from /usr/lib/libgdk-x11-2.0.so.0
#16 0xb7c86c78 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#17 0xb7c86dc1 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#18 0xb798b8d6 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#19 0xb798e996 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#20 0xb798ee1e in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#21 0xb7de1f74 in gtk_main_iteration () from /usr/lib/libgtk-x11-2.0.so.0
#22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac) at /home/tmp/emacs/src/xterm.c:7067
#23 0x080f90dc in read_avail_input (expected=1) at /home/tmp/emacs/src/keyboard.c:6737
#24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
#25 0x080f9317 in input_available_signal (signo=29) at /home/tmp/emacs/src/keyboard.c:6925
#26 signal handler called
#27 0xb795d27d in pthread_mutex_unlock () from /lib/tls/i686/cmov/libpthread.so.0
#28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
#29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
#30 0x08122c57 in directory_files_internal_unwind (dh=161026626) at /home/tmp/emacs/src/dired.c:137
#31 0x08151a4f in unbind_to (count=528, value=137410761) at /home/tmp/emacs/src/eval.c:3337
#32 0x0812401e in file_name_completion (file=160401147, dirname=160401115, all_flag=1, ver_flag=0) at /home/tmp/emacs/src/dired.c:721
#33 0x0812447c in Ffile_name_all_completions (file=160401147, directory=160401115) at /home/tmp/emacs/src/dired.c:442
#34 0x0815312f in Ffuncall (nargs=3, args=0xbfabfb50) at /home/tmp/emacs/src/eval.c:2985
#35 0x0817c99e in Fbyte_code (bytestr=136162331, vector=136162356, maxdepth=32) at /home/tmp/emacs/src/bytecode.c:679
#36 0x081526d8 in Feval (form=136162317) at /home/tmp/emacs/src/eval.c:2319
#37 0x08154991 in internal_lisp_condition_case (var=137410761, bodyform=136162317, handlers=136162453) at /home/tmp/emacs/src/eval.c:1414
#38 0x0817db50 in Fbyte_code (bytestr=136162139, vector=136162156, maxdepth=48) at /home/tmp/emacs/src/bytecode.c:869
#39 0x08152b5a in funcall_lambda (fun=136162100, nargs=1, arg_vector=0xbfabfe94) at /home/tmp/emacs/src/eval.c:3169
#40 0x08152fb0 in Ffuncall (nargs=2, args=0xbfabfe90) at /home/tmp/emacs/src/eval.c:3028
#41 0x0817c99e in Fbyte_code (bytestr=149470715, vector=149471668, maxdepth=56) at /home/tmp/emacs/src/bytecode.c:679
#42 0x08152b5a in funcall_lambda (fun=149031548, nargs=4, arg_vector=0xbfabffc4) at /home/tmp/emacs/src/eval.c:3169
#43 0x08152fb0 in Ffuncall (nargs=5, args=0xbfabffc0) at /home/tmp/emacs/src/eval.c:3028
#44 0x0817c99e in Fbyte_code (bytestr=143938675, vector=143888644, maxdepth=40) at /home/tmp/emacs/src/bytecode.c:679
#45 0x08152b5a in funcall_lambda (fun=143876164, nargs=0, arg_vector=0xbfac00f4) at /home/tmp/emacs/src/eval.c:3169
#46 0x08152fb0 in Ffuncall (nargs=1, args=0xbfac00f0) at /home/tmp/emacs/src/eval.c:3028
#47 0x0817c99e in Fbyte_code (bytestr=144059891, vector=144063476, maxdepth=56) at /home/tmp/emacs/src/bytecode.c:679
#48 0x08152b5a in funcall_lambda (fun=144063812, nargs=1, arg_vector=0xbfac0224) at /home/tmp/emacs/src/eval.c:3169
#49 0x08152fb0 in Ffuncall (nargs=2, args
Re: Lockup
David Kastrup wrote:

Please write in English if possible, because the Emacs maintainers usually do not have translators to read other languages for them. Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list.

Please describe exactly what actions triggered the bug and the precise symptoms of the bug:

Hi, I just had a lockup occurring. Here is a backtrace:

Ouch, this is serious. I currently have no idea how to solve this, except to change Emacs to use SYNC_INPUT. I'll have to think about this.

Jan D.

If emacs crashed, and you have the emacs process in the gdb debugger, please include the output from the following gdb commands: `bt full' and `xbacktrace'. If you would like to further debug the crash, please read the file /usr/local/emacs-21/share/emacs/22.0.50/etc/DEBUG for instructions.

(gdb) bt
#0 0xe410 in __kernel_vsyscall ()
#1 0xb79602ae in __lll_mutex_lock_wait () from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb795cfc7 in _L_mutex_lock_159 () from /lib/tls/i686/cmov/libpthread.so.0
#3 0x0063 in ?? ()
#4 0xbfabe690 in ?? ()
#5 0x086a6d48 in ?? ()
#6 0xbfabe6d8 in ?? ()
#7 0x080c78f0 in handle_one_xevent (dpyinfo=0x82f375c, eventp=0xbfabdbec, finish=0xb79602ae, hold_quit=0x82f375c) at /home/tmp/emacs/src/xterm.c:6916
#8 0x0813b963 in emacs_blocked_malloc (size=4294967292, ptr=0xb7993042) at /home/tmp/emacs/src/alloc.c:1231
#9 0xb77a83c5 in malloc () from /lib/tls/i686/cmov/libc.so.6
#10 0xb7993042 in g_malloc () from /usr/lib/libglib-2.0.so.0
#11 0xb79a2e27 in g_strndup () from /usr/lib/libglib-2.0.so.0
#12 0xb7977fa2 in g_convert_with_fallback () from /usr/lib/libglib-2.0.so.0
#13 0xb79780f5 in g_locale_from_utf8 () from /usr/lib/libglib-2.0.so.0
#14 0xb7c84b5e in gdk_add_client_message_filter () from /usr/lib/libgdk-x11-2.0.so.0
#15 0xb7c855b2 in gdk_x11_register_standard_event_type () from /usr/lib/libgdk-x11-2.0.so.0
#16 0xb7c86c78 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#17 0xb7c86dc1 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#18 0xb798b8d6 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#19 0xb798e996 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#20 0xb798ee1e in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#21 0xb7de1f74 in gtk_main_iteration () from /usr/lib/libgtk-x11-2.0.so.0
#22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac) at /home/tmp/emacs/src/xterm.c:7067
#23 0x080f90dc in read_avail_input (expected=1) at /home/tmp/emacs/src/keyboard.c:6737
#24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
#25 0x080f9317 in input_available_signal (signo=29) at /home/tmp/emacs/src/keyboard.c:6925
#26 signal handler called
#27 0xb795d27d in pthread_mutex_unlock () from /lib/tls/i686/cmov/libpthread.so.0
#28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
#29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
#30 0x08122c57 in directory_files_internal_unwind (dh=161026626) at /home/tmp/emacs/src/dired.c:137
#31 0x08151a4f in unbind_to (count=528, value=137410761) at /home/tmp/emacs/src/eval.c:3337
#32 0x0812401e in file_name_completion (file=160401147, dirname=160401115, all_flag=1, ver_flag=0) at /home/tmp/emacs/src/dired.c:721
#33 0x0812447c in Ffile_name_all_completions (file=160401147, directory=160401115) at /home/tmp/emacs/src/dired.c:442
#34 0x0815312f in Ffuncall (nargs=3, args=0xbfabfb50) at /home/tmp/emacs/src/eval.c:2985
#35 0x0817c99e in Fbyte_code (bytestr=136162331, vector=136162356, maxdepth=32) at /home/tmp/emacs/src/bytecode.c:679
#36 0x081526d8 in Feval (form=136162317) at /home/tmp/emacs/src/eval.c:2319
#37 0x08154991 in internal_lisp_condition_case (var=137410761, bodyform=136162317, handlers=136162453) at /home/tmp/emacs/src/eval.c:1414
#38 0x0817db50 in Fbyte_code (bytestr=136162139, vector=136162156, maxdepth=48) at /home/tmp/emacs/src/bytecode.c:869
#39 0x08152b5a in funcall_lambda (fun=136162100, nargs=1, arg_vector=0xbfabfe94) at /home/tmp/emacs/src/eval.c:3169
#40 0x08152fb0 in Ffuncall (nargs=2, args=0xbfabfe90) at /home/tmp/emacs/src/eval.c:3028
#41 0x0817c99e in Fbyte_code (bytestr=149470715, vector=149471668, maxdepth=56) at /home/tmp/emacs/src/bytecode.c:679
#42 0x08152b5a in funcall_lambda (fun=149031548, nargs=4, arg_vector=0xbfabffc4) at /home/tmp/emacs/src/eval.c:3169
#43 0x08152fb0 in Ffuncall (nargs=5, args=0xbfabffc0) at /home/tmp/emacs/src/eval.c:3028
#44 0x0817c99e in Fbyte_code (bytestr=143938675, vector=143888644, maxdepth=40) at /home/tmp/emacs/src/bytecode.c:679
#45 0x08152b5a in funcall_lambda (fun=143876164, nargs=0, arg_vector=0xbfac00f4) at /home/tmp/emacs/src/eval.c:3169
#46 0x08152fb0 in Ffuncall (nargs=1, args=0xbfac00f0) at /home/tmp/emacs/src/eval.c:3028
#47 0x0817c99e in Fbyte_code (bytestr=144059891, vector=144063476, maxdepth=56) at /home/tmp/emacs
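
For readers following the backtrace: frames #26 to #29 show the input signal arriving while free() was still inside libc, and frames #9 down to #1 show the handler then reaching malloc() through GTK and waiting on a mutex that the interrupted code cannot release. The following is only a rough sketch in C of the general "set a flag in the handler, do the work later" pattern that an option like SYNC_INPUT suggests. It is not the Emacs implementation; every name in it (input_pending, on_input_signal, process_pending_input, install_input_handler) is hypothetical, and the choice of signal is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* All names below are hypothetical illustrations, not Emacs symbols. */

/* Written by the handler, read at safe points in ordinary code.
   volatile sig_atomic_t is the type a handler may portably write. */
static volatile sig_atomic_t input_pending = 0;

/* Number of times deferred input was actually processed. */
static int events_processed = 0;

/* The handler does nothing except record that input arrived.
   It takes no locks and calls no allocator, so it cannot block on a
   mutex held by the code it interrupted. */
static void on_input_signal(int signo)
{
    (void) signo;
    input_pending = 1;
}

/* Called from ordinary code at a point where no libc lock is held.
   Because it never runs inside a signal handler, it may use malloc.
   Returns 1 if input was processed, 0 if none was pending, -1 on
   allocation failure. */
static int process_pending_input(void)
{
    char *buf;

    if (!input_pending)
        return 0;
    input_pending = 0;

    buf = malloc(64);   /* stands in for the real event handling */
    if (buf == NULL)
        return -1;
    strcpy(buf, "event");
    events_processed++;
    free(buf);
    return 1;
}

/* Install the flag-setting handler for SIGNO; returns 0 on success. */
static int install_input_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_input_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

A caller would run install_input_handler once at startup and then call process_pending_input from its main loop or other known-safe points. The cost of this design is latency: input is only noticed when the program next reaches such a point.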