Re: Strange lockup with metacity

2006-11-30 Thread Jan Djärv



Chong Yidong wrote:

There was one controversial patch
applied to src/xterm.c at the beginning of November, which may have
some bearing on this.  This is just a hunch, but could you apply the
reversed patch and see if the problem persists?

Applying that patch fixed the problem. I'm much happier now - thanks!


My preceding email to Jon was not sent to this list, but the patch in
question was the _NET_ACTIVE_WINDOW hack discussed in the "raise-frame
doesn't work in Fedora Core 4" thread on emacs-devel.  Apparently, the
hack has bad side-effects.


It is not a hack, it is following a Freedesktop specification.  We have to 
revisit the whole specification after the release and probably add a lot more 
of these _NET_* settings.  I'd say it is a bug in metacity (there are plenty 
already ...), but I've changed that bit in Emacs so we only send 
_NET_ACTIVE_WINDOW on explicit raise-frame calls.


Jonathan, can you test a newer CVS?  Can you also check if raise-frame works 
on your old copy (i.e. the one that didn't hang metacity) of Emacs and the 
newer one?


Thanks,

Jan D.



___
emacs-pretest-bug mailing list
emacs-pretest-bug@gnu.org
http://lists.gnu.org/mailman/listinfo/emacs-pretest-bug


Re: Strange lockup with metacity

2006-11-30 Thread Jonathan Corbet
Jan Djärv [EMAIL PROTECTED] wrote:

 I'd say it is a bug in metacity (there are
 plenty already ...), but I've changed that bit in Emacs so we only send
 _NET_ACTIVE_WINDOW on explicit raise-frame calls.

I have no trouble believing that it could be a metacity bug - but
nothing else triggers it.

 Jonathan, can you test a newer CVS?  Can you also check if raise-frame
 works on your old copy (i.e. the one that didn't hang metacity) of Emacs
 and the newer one?

Nope.  The behavior has changed, though.  So, to summarize:

  - With 22.0.91: an attempt to resize the window will lock up every
time.

  - 22.0.91 with the patch sent by Chong Yidong backed out: never locks
up.

  - With CVS grabbed in the morning (US/Mountain) of November 30: locks
up maybe one time in five - but still definitely locks up.  It is,
however, easier to unwedge: for whatever reason, switching to
another virtual console and back shakes things loose.

Hope that helps,

jon

Jonathan Corbet / LWN.net / [EMAIL PROTECTED]




Re: Strange lockup with metacity

2006-11-30 Thread Richard Stallman
I'm seeing a strange problem with 22.0.91.1 on an x86-64 Fedora rawhide
system (22.0.90 had it too).  Almost everything works great, but any
attempt to resize an emacs frame using the window manager locks things
up.  Essentially, metacity grabs the mouse then stops, waiting for
something; the only way to get my desktop back is to restart metacity
from somewhere else.

Can you compile metacity with debugging symbols?
You could run it under GDB, perhaps on another machine's console
so you can still type at GDB even when it is hung.




Re: Strange lockup with metacity

2006-11-30 Thread Richard Stallman
My preceding email to Jon was not sent to this list, but the patch in
question was the _NET_ACTIVE_WINDOW hack discussed in the "raise-frame
doesn't work in Fedora Core 4" thread on emacs-devel.  Apparently, the
hack has bad side-effects.

If this causes serious problems on some systems, we have to take it out.
But please don't delete the code.  Please put #if 0 around it,
and add another comment explaining the particulars of this problem.
That might help us some time in the future figure out a better solution.




Re: Strange lockup with metacity

2006-11-30 Thread Jan Djärv



Jonathan Corbet wrote:

Jan Djärv [EMAIL PROTECTED] wrote:


I'd say it is a bug in metacity (there are
plenty already ...), but I've changed that bit in Emacs so we only send
_NET_ACTIVE_WINDOW on explicit raise-frame calls.


I have no trouble believing that it could be a metacity bug - but
nothing else triggers it.


Jonathan, can you test a newer CVS?  Can you also check if raise-frame
works on your old copy (i.e. the one that didn't hang metacity) of Emacs
and the newer one?


Nope.  The behavior has changed, though.  So, to summarize:

  - With 22.0.91: an attempt to resize the window will lock up every
time.

  - 22.0.91 with the patch sent by Chong Yidong backed out: never locks
up.

  - With CVS grabbed in the morning (US/Mountain) of November 30: locks
up maybe one time in five - but still definitely locks up.  It is,
however, easier to unwedge: for whatever reason, switching to
another virtual console and back shakes things loose.

Hope that helps,


Can you find out what version of metacity you have?
If we can't fix this, we have to take out that code as Richard says.  Too many 
people are using metacity.  Funny, choosing between metacity bugs.  At least 
the raise-frame bug doesn't cause a hang.


Thanks,

Jan D.





Strange lockup with metacity

2006-11-29 Thread Jonathan Corbet
I'm seeing a strange problem with 22.0.91.1 on an x86-64 Fedora rawhide
system (22.0.90 had it too).  Almost everything works great, but any
attempt to resize an emacs frame using the window manager locks things
up.  Essentially, metacity grabs the mouse then stops, waiting for
something; the only way to get my desktop back is to restart metacity
from somewhere else.

This does not happen with any other application; it also does not happen
with emacs 21.  Clearly, emacs 22 pretest is doing something
differently, and it's creating weirdness.  I'm not sure how to try to
debug this, but thought I would toss it out there.  If there's further
information I could get to help track it down, let me know and I'll do
my best.

Thanks,

jon




Re: Strange lockup with metacity

2006-11-29 Thread Chong Yidong
Jonathan Corbet [EMAIL PROTECTED] writes:

 I'm seeing a strange problem with 22.0.91.1 on an x86-64 Fedora rawhide
 system (22.0.90 had it too).  Almost everything works great, but any
 attempt to resize an emacs frame using the window manager locks things
 up.  Essentially, metacity grabs the mouse then stops, waiting for
 something; the only way to get my desktop back is to restart metacity
 from somewhere else.

 This does not happen with any other application; it also does not happen
 with emacs 21.  Clearly, emacs 22 pretest is doing something
 differently, and it's creating weirdness.  I'm not sure how to try to
 debug this, but thought I would toss it out there.  If there's further
 information I could get to help track it down, let me know and I'll do
 my best.

It does not happen for me on Ubuntu Dapper (Metacity 2.14.5).  We need
more information: does it happen with `emacs -Q', and when Emacs is
compiled with/without GTK support?  Please provide the information
given using M-x report-emacs-bug RET.





Re: Strange lockup with metacity

2006-11-29 Thread Jonathan Corbet
Chong Yidong [EMAIL PROTECTED] wrote:

 There was one controversial patch
 applied to src/xterm.c at the beginning of November, which may have
 some bearing on this.  This is just a hunch, but could you apply the
 reversed patch and see if the problem persists?

Applying that patch fixed the problem. I'm much happier now - thanks!

jon

Jonathan Corbet / LWN.net / [EMAIL PROTECTED]




Re: Lockup

2006-08-11 Thread Jan Djärv



YAMAMOTO Mitsuharu wrote:

On Thu, 10 Aug 2006 13:17:03 +0200, Jan Djärv [EMAIL PROTECTED] said:



My intention was that the above scenario would be avoided with
BLOCK_INPUT around functions that may call malloc-related
functions.



It does not help if the calling thread is one of the Gnome threads.


But a signal delivered to a non-main thread is redirected to the main
thread by SIGNAL_THREAD_CHECK.


A signal yes, but I was thinking of this scenario:

A Gnome thread does malloc, gets the mutex lock and enters the malloc code.
A signal is delivered (in the main thread as you point out) and enters malloc 
also.  This situation is exactly like the one with the lockup, but here we 
can't use BLOCK_INPUT around the malloc related functions because they are in 
the Gnome code.





How about just changing the order of lock/unlock and
BLOCK_INPUT/UNBLOCK_INPUT in the previous version of
BLOCK_INPUT_ALLOC/UNBLOCK_INPUT_ALLOC?



That would mean that lock/unlock mutex functions are called in the
signal handler context, which is not allowed according to the
documentation.


Yes, pthread_mutex_(un)lock is not async-signal-safe.  But we are
already using such functions as malloc in the signal handler context
(with the help of BLOCK_INPUT).  I guess calling
pthread_mutex_(un)lock in the signal handler context is safe in
reality unless the interrupted thread is also executing
pthread_mutex_(un)lock for the same mutex.  I think it's better than
the current one, i.e., not protecting shared resources such as
__malloc_hook in the signal handler context.


I agree with your assumption that the lockup occurs because the signal handler 
and the interrupted thread are calling pthread_mutex_(un)lock for the same 
mutex.  But BLOCK_INPUT does not help, because Gnome code does not have it.


So I tried to do the next best thing, i.e. block SIGIO in non-main threads. 
The problem with this is that I can't block SIGIO before taking the mutex, 
because if I hang when taking the mutex, SIGIO would remain blocked.  One 
could use trylock and some sort of busy loop, but I don't think that is usable.



(Of course SYNC_INPUT is the right direction, but the current plan is
not enabling it in the next release as far as I understand.)


Unless someone comes up with a supersafe scheme, I think we have to live with 
this race condition until then.  But it is better now than before: SIGIO and 
the main thread executing the same (un)lock should not lock up.  But if the 
signal handler is executing in one thread on one processor and a Gnome thread 
is executing on another processor, there could be a lockup.


Jan D.





Re: Lockup

2006-08-11 Thread David Kastrup
YAMAMOTO Mitsuharu [EMAIL PROTECTED] writes:

 Yes, pthread_mutex_(un)lock is not async-signal-safe.  But we are
 already using such functions as malloc in the signal handler context
 (with the help of BLOCK_INPUT).  I guess calling
 pthread_mutex_(un)lock in the signal handler context is safe in
 reality unless the interrupted thread is also executing
 pthread_mutex_(un)lock for the same mutex.

"I guess ... safe in reality unless ..."

Maybe I have been around programmers too long, but I don't find this
exactly reassuring.

-- 
David Kastrup, Kriemhildstr. 15, 44793 Bochum




Re: Lockup

2006-08-11 Thread YAMAMOTO Mitsuharu
 On Fri, 11 Aug 2006 08:36:39 +0200, Jan Djärv [EMAIL PROTECTED] said:

 A signal yes, but I was thinking of this scenario:

 A Gnome thread does malloc, gets the mutex lock and enters the
 malloc code.  A signal is delivered (in the main thread as you point
 out) and enters malloc also.  This situation is exactly like the one
 with the lockup, but here we can't use BLOCK_INPUT around the malloc
 related functions because they are in the Gnome code.

I think such a case just behaves like a usual mutual exclusion between
multiple threads: one thread acquires a mutex, and the other blocks
until it is released.

 I agree with your assumption that the lockup occurs because the
 signal handler and the interrupted thread are calling
 pthread_mutex_(un)lock for the same mutex.  But BLOCK_INPUT does not
 help, because Gnome code does not have it.

That's not a problem because Gnome threads (non-main threads) never
execute pthread_mutex_(un)lock in the signal handler context.

 YAMAMOTO Mitsuharu
[EMAIL PROTECTED]




Re: Lockup

2006-08-11 Thread YAMAMOTO Mitsuharu
 On Fri, 11 Aug 2006 09:04:34 +0200, David Kastrup [EMAIL PROTECTED] 
 said:

 YAMAMOTO Mitsuharu [EMAIL PROTECTED] writes:
 Yes, pthread_mutex_(un)lock is not async-signal-safe.  But we are
 already using such functions as malloc in the signal handler
 context (with the help of BLOCK_INPUT).  I guess calling
 pthread_mutex_(un)lock in the signal handler context is safe in
 reality unless the interrupted thread is also executing
 pthread_mutex_(un)lock for the same mutex.

 I guess ... safe in reality unless ...

 Maybe I have been around programmers too long, but I don't find this
 exactly reassuring.

Yeah.  IEEE Std 1003.1 provides a table of async-signal-safe functions
(those that can be called safely within a signal handler), and neither
malloc nor pthread_mutex_(un)lock is among them.

  All functions not in the above table are considered to be unsafe
  with respect to signals. In the presence of signals, all functions
  defined by this volume of IEEE Std 1003.1-2001 shall behave as
  defined when called from or interrupted by a signal-catching
  function, with a single exception: when a signal interrupts an
  unsafe function and the signal-catching function calls an unsafe
  function, the behavior is undefined.

(http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html)

We already have malloc calls in the signal handler context with the
assumption that it is safe to call unless the signal interrupts
malloc-related functions.  So I think it's not that bad to also put
reasonable assumptions to pthread_mutex_(un)lock.

 YAMAMOTO Mitsuharu
[EMAIL PROTECTED]




Re: Lockup

2006-08-11 Thread YAMAMOTO Mitsuharu
 On Fri, 11 Aug 2006 10:09:49 +0200, Jan Djärv [EMAIL PROTECTED] said:

 That's not a problem because Gnome threads (non-main threads) never
 execute pthread_mutex_(un)lock in the signal handler context.

 That does not help, the main thread executes in signal handler
 context sometimes.  And in that case, both the Gnome thread and the
 signal handler may be executing (un)lock_mutex on the same mutex.

I don't think this causes a problem.  The signal handler is executed
in the main thread that is different from the Gnome thread.  And as I
quoted from IEEE Std 1003.1 in another message, a
pthread_mutex_(un)lock call in the signal handler context should work
as usual unless the signal interrupted an unsafe function.

The condition "unless the signal interrupted an unsafe function" is
too strict in reality.  My guess was that it could be relaxed to
"unless the signal interrupted a pthread_mutex_(un)lock call for the
same mutex".

 YAMAMOTO Mitsuharu
[EMAIL PROTECTED]




Re: Lockup

2006-08-10 Thread Jan Djärv



David Kastrup wrote:


Hi, I just had a lockup occurring.  Here is a backtrace:


I've checked in a fix, but I believe the race condition still exists on 
multiprocessor machines.  I can't see a way to fix that except move to SYNC_INPUT.


Jan D.



If emacs crashed, and you have the emacs process in the gdb debugger,
please include the output from the following gdb commands:
`bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/local/emacs-21/share/emacs/22.0.50/etc/DEBUG for instructions.

(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0xb79602ae in __lll_mutex_lock_wait ()
   from /lib/tls/i686/cmov/libpthread.so.0
#2  0xb795cfc7 in _L_mutex_lock_159 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0x0063 in ?? ()
#4  0xbfabe690 in ?? ()
#5  0x086a6d48 in ?? ()
#6  0xbfabe6d8 in ?? ()
#7  0x080c78f0 in handle_one_xevent (dpyinfo=0x82f375c, eventp=0xbfabdbec,
finish=0xb79602ae, hold_quit=0x82f375c) at /home/tmp/emacs/src/xterm.c:6916
#8  0x0813b963 in emacs_blocked_malloc (size=4294967292, ptr=0xb7993042)
at /home/tmp/emacs/src/alloc.c:1231
#9  0xb77a83c5 in malloc () from /lib/tls/i686/cmov/libc.so.6
#10 0xb7993042 in g_malloc () from /usr/lib/libglib-2.0.so.0
#11 0xb79a2e27 in g_strndup () from /usr/lib/libglib-2.0.so.0
#12 0xb7977fa2 in g_convert_with_fallback () from /usr/lib/libglib-2.0.so.0
#13 0xb79780f5 in g_locale_from_utf8 () from /usr/lib/libglib-2.0.so.0
#14 0xb7c84b5e in gdk_add_client_message_filter ()
   from /usr/lib/libgdk-x11-2.0.so.0
#15 0xb7c855b2 in gdk_x11_register_standard_event_type ()
   from /usr/lib/libgdk-x11-2.0.so.0
#16 0xb7c86c78 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#17 0xb7c86dc1 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#18 0xb798b8d6 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#19 0xb798e996 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#20 0xb798ee1e in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#21 0xb7de1f74 in gtk_main_iteration () from /usr/lib/libgtk-x11-2.0.so.0
#22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac)
at /home/tmp/emacs/src/xterm.c:7067
#23 0x080f90dc in read_avail_input (expected=1)
at /home/tmp/emacs/src/keyboard.c:6737
#24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
#25 0x080f9317 in input_available_signal (signo=29)
at /home/tmp/emacs/src/keyboard.c:6925
#26 signal handler called
#27 0xb795d27d in pthread_mutex_unlock ()
   from /lib/tls/i686/cmov/libpthread.so.0
#28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
#29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
#30 0x08122c57 in directory_files_internal_unwind (dh=161026626)
at /home/tmp/emacs/src/dired.c:137
#31 0x08151a4f in unbind_to (count=528, value=137410761)
at /home/tmp/emacs/src/eval.c:3337
#32 0x0812401e in file_name_completion (file=160401147, dirname=160401115,
all_flag=1, ver_flag=0) at /home/tmp/emacs/src/dired.c:721
#33 0x0812447c in Ffile_name_all_completions (file=160401147,
directory=160401115) at /home/tmp/emacs/src/dired.c:442
#34 0x0815312f in Ffuncall (nargs=3, args=0xbfabfb50)
at /home/tmp/emacs/src/eval.c:2985
#35 0x0817c99e in Fbyte_code (bytestr=136162331, vector=136162356, maxdepth=32)
at /home/tmp/emacs/src/bytecode.c:679
#36 0x081526d8 in Feval (form=136162317) at /home/tmp/emacs/src/eval.c:2319
#37 0x08154991 in internal_lisp_condition_case (var=137410761,
bodyform=136162317, handlers=136162453) at /home/tmp/emacs/src/eval.c:1414
#38 0x0817db50 in Fbyte_code (bytestr=136162139, vector=136162156, maxdepth=48)
at /home/tmp/emacs/src/bytecode.c:869
#39 0x08152b5a in funcall_lambda (fun=136162100, nargs=1,
arg_vector=0xbfabfe94) at /home/tmp/emacs/src/eval.c:3169
#40 0x08152fb0 in Ffuncall (nargs=2, args=0xbfabfe90)
at /home/tmp/emacs/src/eval.c:3028
#41 0x0817c99e in Fbyte_code (bytestr=149470715, vector=149471668, maxdepth=56)
at /home/tmp/emacs/src/bytecode.c:679
#42 0x08152b5a in funcall_lambda (fun=149031548, nargs=4,
arg_vector=0xbfabffc4) at /home/tmp/emacs/src/eval.c:3169
#43 0x08152fb0 in Ffuncall (nargs=5, args=0xbfabffc0)
at /home/tmp/emacs/src/eval.c:3028
#44 0x0817c99e in Fbyte_code (bytestr=143938675, vector=143888644, maxdepth=40)
at /home/tmp/emacs/src/bytecode.c:679
#45 0x08152b5a in funcall_lambda (fun=143876164, nargs=0,
arg_vector=0xbfac00f4) at /home/tmp/emacs/src/eval.c:3169
#46 0x08152fb0 in Ffuncall (nargs=1, args=0xbfac00f0)
at /home/tmp/emacs/src/eval.c:3028
#47 0x0817c99e in Fbyte_code (bytestr=144059891, vector=144063476, maxdepth=56)
at /home/tmp/emacs/src/bytecode.c:679
#48 0x08152b5a in funcall_lambda (fun=144063812, nargs=1,
arg_vector=0xbfac0224) at /home/tmp/emacs/src/eval.c:3169
#49 0x08152fb0 in Ffuncall (nargs=2, args=0xbfac0220)
at /home/tmp/emacs/src/eval.c:3028
#50 0x0817c99e in Fbyte_code (bytestr=144059651, vector

Re: Lockup

2006-08-10 Thread YAMAMOTO Mitsuharu
 On Thu, 10 Aug 2006 08:20:24 +0200, Jan Djärv [EMAIL PROTECTED] said:

 Hi, I just had a lockup occurring.  Here is a backtrace:

 I've checked in a fix, but I believe the race condition still exists
 on multiprocessor machines.  I can't see a way to fix that except
 move to SYNC_INPUT.

Maybe I'm missing something, but doesn't adding BLOCK_INPUT around
closedir (and opendir) help?

 YAMAMOTO Mitsuharu
[EMAIL PROTECTED]

 #22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac)
 at /home/tmp/emacs/src/xterm.c:7067
 #23 0x080f90dc in read_avail_input (expected=1)
 at /home/tmp/emacs/src/keyboard.c:6737
 #24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
 #25 0x080f9317 in input_available_signal (signo=29)
 at /home/tmp/emacs/src/keyboard.c:6925
 #26 signal handler called
 #27 0xb795d27d in pthread_mutex_unlock ()
from /lib/tls/i686/cmov/libpthread.so.0
 #28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
 #29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
 #30 0x08122c57 in directory_files_internal_unwind (dh=161026626)
 at /home/tmp/emacs/src/dired.c:137




Re: Lockup

2006-08-10 Thread Jan Djärv



YAMAMOTO Mitsuharu wrote:

On Thu, 10 Aug 2006 08:20:24 +0200, Jan Djärv [EMAIL PROTECTED] said:



Hi, I just had a lockup occurring.  Here is a backtrace:



I've checked in a fix, but I believe the race condition still exists
on multiprocessor machines.  I can't see a way to fix that except
move to SYNC_INPUT.


Maybe I'm missing something, but doesn't adding BLOCK_INPUT around
closedir (and opendir) help?



In this particular case it would help, but in general the problem is that the 
signal handler gets called when the main thread is executing in the mutex code 
(pthread_mutex_unlock).  Later, when the signal handler tries to get the mutex, 
it locks up; calling mutex functions from a signal handler is actually not 
allowed.  The mutex is there to protect from other (Gnome) threads that also 
call malloc/free.


But if a Gnome thread has the mutex and, before it has blocked signals, the 
signal handler is run in parallel on another processor, there will be 
problems.  If we move to SYNC_INPUT there will be no malloc/free called in the 
signal handler and we only need the mutex to handle concurrent access.


Jan D.



#22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac)
at /home/tmp/emacs/src/xterm.c:7067
#23 0x080f90dc in read_avail_input (expected=1)
at /home/tmp/emacs/src/keyboard.c:6737
#24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
#25 0x080f9317 in input_available_signal (signo=29)
at /home/tmp/emacs/src/keyboard.c:6925
#26 signal handler called
#27 0xb795d27d in pthread_mutex_unlock ()
   from /lib/tls/i686/cmov/libpthread.so.0
#28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
#29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
#30 0x08122c57 in directory_files_internal_unwind (dh=161026626)
at /home/tmp/emacs/src/dired.c:137







Re: Lockup

2006-08-10 Thread YAMAMOTO Mitsuharu
 On Thu, 10 Aug 2006 10:11:55 +0200, Jan Djärv [EMAIL PROTECTED] said:

 Maybe I'm missing something, but doesn't adding BLOCK_INPUT around
 closedir (and opendir) help?

 In this particular case it would help, but in general the problem is
 that the signal handler gets called when the main thread is
 executing in the mutex code (pthread_mutex_unlock).  Later when the
 signal handler tries to get the mutex, it locks up,

My intention was that the above scenario would be avoided with
BLOCK_INPUT around functions that may call malloc-related functions.

 it is actually not allowed to call mutex functions from a signal
 handler.  The mutex is there to protect from other (Gnome) threads
 that also call malloc/free.

 But if a Gnome thread has the mutex and before it has blocked
 signals, the signal handler is run in parallel on another
 processor, there will be problems.  If we move to SYNC_INPUT there
 will be no malloc/free called in the signal handler and we only need
 the mutex to handle concurrent access.

The current version would cause such a problem because now
BLOCK_INPUT_ALLOC and UNBLOCK_INPUT_ALLOC are no-op in a signal
handler.

How about just changing the order of lock/unlock and
BLOCK_INPUT/UNBLOCK_INPUT in the previous version of
BLOCK_INPUT_ALLOC/UNBLOCK_INPUT_ALLOC?

#define BLOCK_INPUT_ALLOC                               \
  do                                                    \
    {                                                   \
      if (pthread_self () == main_thread)               \
        BLOCK_INPUT;                                    \
      pthread_mutex_lock (&alloc_mutex);                \
    }                                                   \
  while (0)
#define UNBLOCK_INPUT_ALLOC                             \
  do                                                    \
    {                                                   \
      pthread_mutex_unlock (&alloc_mutex);              \
      if (pthread_self () == main_thread)               \
        UNBLOCK_INPUT;                                  \
    }                                                   \
  while (0)
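
To make the proposed ordering concrete outside of Emacs, here is a self-contained sketch.  Everything other than the two macro names is a simplified stand-in and an assumption: BLOCK_INPUT and UNBLOCK_INPUT are reduced to a bare nesting counter, and init_alloc_guard and guarded_malloc are hypothetical helpers.  The sketch compares thread IDs with pthread_equal, since pthread_t need not be a scalar type.

```c
#include <pthread.h>
#include <stdlib.h>

/* Simplified stand-ins for the Emacs internals the macros rely on.  */
static volatile int interrupt_input_blocked;   /* Nesting depth.  */
static pthread_t main_thread;
static pthread_mutex_t alloc_mutex = PTHREAD_MUTEX_INITIALIZER;

#define BLOCK_INPUT   (interrupt_input_blocked++)
#define UNBLOCK_INPUT (interrupt_input_blocked--)

/* Proposed order: block input first (main thread only), then lock.  */
#define BLOCK_INPUT_ALLOC                               \
  do                                                    \
    {                                                   \
      if (pthread_equal (pthread_self (), main_thread)) \
        BLOCK_INPUT;                                    \
      pthread_mutex_lock (&alloc_mutex);                \
    }                                                   \
  while (0)

/* Mirror image: unlock first, then unblock input.  */
#define UNBLOCK_INPUT_ALLOC                             \
  do                                                    \
    {                                                   \
      pthread_mutex_unlock (&alloc_mutex);              \
      if (pthread_equal (pthread_self (), main_thread)) \
        UNBLOCK_INPUT;                                  \
    }                                                   \
  while (0)

/* Record which thread counts as the main thread.  */
void
init_alloc_guard (void)
{
  main_thread = pthread_self ();
}

/* malloc wrapped in the guard; DEPTH_SEEN receives the input-blocking
   depth observed while the mutex was held.  */
void *
guarded_malloc (size_t size, int *depth_seen)
{
  void *p;

  BLOCK_INPUT_ALLOC;
  *depth_seen = interrupt_input_blocked;
  p = malloc (size);
  UNBLOCK_INPUT_ALLOC;
  return p;
}
```

In the main thread the mutex is only ever held while input is blocked, so a handler that respects the blocking counter never finds the mutex taken by the code it interrupted.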

 YAMAMOTO Mitsuharu
[EMAIL PROTECTED]




Re: Lockup

2006-08-10 Thread Jan Djärv



YAMAMOTO Mitsuharu wrote:

On Thu, 10 Aug 2006 10:11:55 +0200, Jan Djärv [EMAIL PROTECTED] said:





In this particular case it would help, but in general the problem is
that the signal handler gets called when the main thread is
executing in the mutex code (pthread_mutex_unlock).  Later when the
signal handler tries to get the mutex, it locks up,


My intention was that the above scenario would be avoided with
BLOCK_INPUT around functions that may call malloc-related functions.


It does not help if the calling thread is one of the Gnome threads.



How about just changing the order of lock/unlock and
BLOCK_INPUT/UNBLOCK_INPUT in the previous version of
BLOCK_INPUT_ALLOC/UNBLOCK_INPUT_ALLOC?

#define BLOCK_INPUT_ALLOC                               \
  do                                                    \
    {                                                   \
      if (pthread_self () == main_thread)               \
        BLOCK_INPUT;                                    \
      pthread_mutex_lock (&alloc_mutex);                \
    }                                                   \
  while (0)
#define UNBLOCK_INPUT_ALLOC                             \
  do                                                    \
    {                                                   \
      pthread_mutex_unlock (&alloc_mutex);              \
      if (pthread_self () == main_thread)               \
        UNBLOCK_INPUT;                                  \
    }                                                   \
  while (0)



That would mean that lock/unlock mutex functions are called in the signal 
handler context, which is not allowed according to the documentation.


Jan D.





Re: Lockup

2006-08-10 Thread YAMAMOTO Mitsuharu
 On Thu, 10 Aug 2006 13:17:03 +0200, Jan Djärv [EMAIL PROTECTED] said:

 My intention was that the above scenario would be avoided with
 BLOCK_INPUT around functions that may call malloc-related
 functions.

 It does not help if the calling thread is one of the Gnome threads.

But a signal delivered to a non-main thread is redirected to the main
thread by SIGNAL_THREAD_CHECK.

 How about just changing the order of lock/unlock and
 BLOCK_INPUT/UNBLOCK_INPUT in the previous version of
 BLOCK_INPUT_ALLOC/UNBLOCK_INPUT_ALLOC?

 That would mean that lock/unlock mutex functions are called in the
 signal handler context, which is not allowed according to the
 documentation.

Yes, pthread_mutex_(un)lock is not async-signal-safe.  But we are
already using such functions as malloc in the signal handler context
(with the help of BLOCK_INPUT).  I guess calling
pthread_mutex_(un)lock in the signal handler context is safe in
reality unless the interrupted thread is also executing
pthread_mutex_(un)lock for the same mutex.  I think it's better than
the current one, i.e., not protecting shared resources such as
__malloc_hook in the signal handler context.

(Of course SYNC_INPUT is the right direction, but the current plan is
not enabling it in the next release as far as I understand.)

 YAMAMOTO Mitsuharu
[EMAIL PROTECTED]




Lockup

2006-08-01 Thread David Kastrup

Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.

Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list.

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

Hi, I just had a lockup occurring.  Here is a backtrace:

If emacs crashed, and you have the emacs process in the gdb debugger,
please include the output from the following gdb commands:
`bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/local/emacs-21/share/emacs/22.0.50/etc/DEBUG for instructions.

(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0xb79602ae in __lll_mutex_lock_wait ()
   from /lib/tls/i686/cmov/libpthread.so.0
#2  0xb795cfc7 in _L_mutex_lock_159 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0x0063 in ?? ()
#4  0xbfabe690 in ?? ()
#5  0x086a6d48 in ?? ()
#6  0xbfabe6d8 in ?? ()
#7  0x080c78f0 in handle_one_xevent (dpyinfo=0x82f375c, eventp=0xbfabdbec,
finish=0xb79602ae, hold_quit=0x82f375c) at /home/tmp/emacs/src/xterm.c:6916
#8  0x0813b963 in emacs_blocked_malloc (size=4294967292, ptr=0xb7993042)
at /home/tmp/emacs/src/alloc.c:1231
#9  0xb77a83c5 in malloc () from /lib/tls/i686/cmov/libc.so.6
#10 0xb7993042 in g_malloc () from /usr/lib/libglib-2.0.so.0
#11 0xb79a2e27 in g_strndup () from /usr/lib/libglib-2.0.so.0
#12 0xb7977fa2 in g_convert_with_fallback () from /usr/lib/libglib-2.0.so.0
#13 0xb79780f5 in g_locale_from_utf8 () from /usr/lib/libglib-2.0.so.0
#14 0xb7c84b5e in gdk_add_client_message_filter ()
   from /usr/lib/libgdk-x11-2.0.so.0
#15 0xb7c855b2 in gdk_x11_register_standard_event_type ()
   from /usr/lib/libgdk-x11-2.0.so.0
#16 0xb7c86c78 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#17 0xb7c86dc1 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#18 0xb798b8d6 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#19 0xb798e996 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#20 0xb798ee1e in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#21 0xb7de1f74 in gtk_main_iteration () from /usr/lib/libgtk-x11-2.0.so.0
#22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac)
at /home/tmp/emacs/src/xterm.c:7067
#23 0x080f90dc in read_avail_input (expected=1)
at /home/tmp/emacs/src/keyboard.c:6737
#24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
#25 0x080f9317 in input_available_signal (signo=29)
at /home/tmp/emacs/src/keyboard.c:6925
#26 signal handler called
#27 0xb795d27d in pthread_mutex_unlock ()
   from /lib/tls/i686/cmov/libpthread.so.0
#28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
#29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
#30 0x08122c57 in directory_files_internal_unwind (dh=161026626)
at /home/tmp/emacs/src/dired.c:137
#31 0x08151a4f in unbind_to (count=528, value=137410761)
at /home/tmp/emacs/src/eval.c:3337
#32 0x0812401e in file_name_completion (file=160401147, dirname=160401115,
all_flag=1, ver_flag=0) at /home/tmp/emacs/src/dired.c:721
#33 0x0812447c in Ffile_name_all_completions (file=160401147,
directory=160401115) at /home/tmp/emacs/src/dired.c:442
#34 0x0815312f in Ffuncall (nargs=3, args=0xbfabfb50)
at /home/tmp/emacs/src/eval.c:2985
#35 0x0817c99e in Fbyte_code (bytestr=136162331, vector=136162356, maxdepth=32)
at /home/tmp/emacs/src/bytecode.c:679
#36 0x081526d8 in Feval (form=136162317) at /home/tmp/emacs/src/eval.c:2319
#37 0x08154991 in internal_lisp_condition_case (var=137410761,
bodyform=136162317, handlers=136162453) at /home/tmp/emacs/src/eval.c:1414
#38 0x0817db50 in Fbyte_code (bytestr=136162139, vector=136162156, maxdepth=48)
at /home/tmp/emacs/src/bytecode.c:869
#39 0x08152b5a in funcall_lambda (fun=136162100, nargs=1,
arg_vector=0xbfabfe94) at /home/tmp/emacs/src/eval.c:3169
#40 0x08152fb0 in Ffuncall (nargs=2, args=0xbfabfe90)
at /home/tmp/emacs/src/eval.c:3028
#41 0x0817c99e in Fbyte_code (bytestr=149470715, vector=149471668, maxdepth=56)
at /home/tmp/emacs/src/bytecode.c:679
#42 0x08152b5a in funcall_lambda (fun=149031548, nargs=4,
arg_vector=0xbfabffc4) at /home/tmp/emacs/src/eval.c:3169
#43 0x08152fb0 in Ffuncall (nargs=5, args=0xbfabffc0)
at /home/tmp/emacs/src/eval.c:3028
#44 0x0817c99e in Fbyte_code (bytestr=143938675, vector=143888644, maxdepth=40)
at /home/tmp/emacs/src/bytecode.c:679
#45 0x08152b5a in funcall_lambda (fun=143876164, nargs=0,
arg_vector=0xbfac00f4) at /home/tmp/emacs/src/eval.c:3169
#46 0x08152fb0 in Ffuncall (nargs=1, args=0xbfac00f0)
at /home/tmp/emacs/src/eval.c:3028
#47 0x0817c99e in Fbyte_code (bytestr=144059891, vector=144063476, maxdepth=56)
at /home/tmp/emacs/src/bytecode.c:679
#48 0x08152b5a in funcall_lambda (fun=144063812, nargs=1,
arg_vector=0xbfac0224) at /home/tmp/emacs/src/eval.c:3169
#49 0x08152fb0 in Ffuncall (nargs=2, args

Re: Lockup

2006-08-01 Thread Jan Djärv



David Kastrup wrote:

Please write in English if possible, because the Emacs maintainers
usually do not have translators to read other languages for them.

Your bug report will be posted to the emacs-pretest-bug@gnu.org mailing list.

Please describe exactly what actions triggered the bug
and the precise symptoms of the bug:

Hi, I just had a lockup occurring.  Here is a backtrace:


Ouch, this is serious.  I currently have no idea how to solve this, except to
change Emacs to use SYNC_INPUT.  I'll have to think about this.


Jan D.
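
To illustrate what a SYNC_INPUT-style approach implies, here is a minimal sketch.  It is not the Emacs code.  In the backtrace, the input signal handler (frame #25) interrupts free() and then reaches malloc() through GTK, where it appears to wait on a mutex (frames #0 to #2).  The sketch avoids that situation by letting the handler only set a flag and doing the real work later, at a point where calling malloc() is safe.  All names (input_pending, on_input_signal, install_input_handler, process_pending_input) are hypothetical, and SIGUSR1 merely stands in for the real input signal.

```c
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not Emacs code. */

/* Set by the signal handler, cleared by the main flow. */
volatile sig_atomic_t input_pending = 0;

/* Counts events handled outside the signal handler. */
int events_handled = 0;

/* Async-signal-safe handler: it only records that input arrived.
   It never calls malloc(), free(), or any toolkit function. */
void on_input_signal(int signo)
{
    (void)signo;
    input_pending = 1;
}

/* Install the handler.  SIGUSR1 is an assumption standing in for the
   input signal.  Returns 0 on success, -1 on failure. */
int install_input_handler(void)
{
    return signal(SIGUSR1, on_input_signal) == SIG_ERR ? -1 : 0;
}

/* Call this only at known-safe points in the main flow, never from a
   signal handler.  Here it is fine to allocate memory.
   Returns 1 if input was processed, 0 if nothing was pending,
   -1 if allocation failed. */
int process_pending_input(void)
{
    char *buf;

    if (!input_pending)
        return 0;
    input_pending = 0;

    buf = malloc(32);           /* safe: we are not inside a handler */
    if (buf == NULL)
        return -1;
    strcpy(buf, "event");       /* placeholder for real event handling */
    events_handled++;
    free(buf);
    return 1;
}
```

A caller would run install_input_handler() once at startup and then call process_pending_input() from its main loop.  The cost of this design is latency: input is only handled when the main flow reaches a polling point.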



If emacs crashed, and you have the emacs process in the gdb debugger,
please include the output from the following gdb commands:
`bt full' and `xbacktrace'.
If you would like to further debug the crash, please read the file
/usr/local/emacs-21/share/emacs/22.0.50/etc/DEBUG for instructions.

(gdb) bt
#0  0xe410 in __kernel_vsyscall ()
#1  0xb79602ae in __lll_mutex_lock_wait ()
   from /lib/tls/i686/cmov/libpthread.so.0
#2  0xb795cfc7 in _L_mutex_lock_159 () from /lib/tls/i686/cmov/libpthread.so.0
#3  0x0063 in ?? ()
#4  0xbfabe690 in ?? ()
#5  0x086a6d48 in ?? ()
#6  0xbfabe6d8 in ?? ()
#7  0x080c78f0 in handle_one_xevent (dpyinfo=0x82f375c, eventp=0xbfabdbec,
finish=0xb79602ae, hold_quit=0x82f375c) at /home/tmp/emacs/src/xterm.c:6916
#8  0x0813b963 in emacs_blocked_malloc (size=4294967292, ptr=0xb7993042)
at /home/tmp/emacs/src/alloc.c:1231
#9  0xb77a83c5 in malloc () from /lib/tls/i686/cmov/libc.so.6
#10 0xb7993042 in g_malloc () from /usr/lib/libglib-2.0.so.0
#11 0xb79a2e27 in g_strndup () from /usr/lib/libglib-2.0.so.0
#12 0xb7977fa2 in g_convert_with_fallback () from /usr/lib/libglib-2.0.so.0
#13 0xb79780f5 in g_locale_from_utf8 () from /usr/lib/libglib-2.0.so.0
#14 0xb7c84b5e in gdk_add_client_message_filter ()
   from /usr/lib/libgdk-x11-2.0.so.0
#15 0xb7c855b2 in gdk_x11_register_standard_event_type ()
   from /usr/lib/libgdk-x11-2.0.so.0
#16 0xb7c86c78 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#17 0xb7c86dc1 in _gdk_events_queue () from /usr/lib/libgdk-x11-2.0.so.0
#18 0xb798b8d6 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#19 0xb798e996 in g_main_context_check () from /usr/lib/libglib-2.0.so.0
#20 0xb798ee1e in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#21 0xb7de1f74 in gtk_main_iteration () from /usr/lib/libgtk-x11-2.0.so.0
#22 0x080cac12 in XTread_socket (sd=0, expected=1, hold_quit=0xbfabf5ac)
at /home/tmp/emacs/src/xterm.c:7067
#23 0x080f90dc in read_avail_input (expected=1)
at /home/tmp/emacs/src/keyboard.c:6737
#24 0x080f9283 in handle_async_input () at /home/tmp/emacs/src/keyboard.c:6883
#25 0x080f9317 in input_available_signal (signo=29)
at /home/tmp/emacs/src/keyboard.c:6925
#26 signal handler called
#27 0xb795d27d in pthread_mutex_unlock ()
   from /lib/tls/i686/cmov/libpthread.so.0
#28 0xb77a62f5 in free () from /lib/tls/i686/cmov/libc.so.6
#29 0xb77caa08 in closedir () from /lib/tls/i686/cmov/libc.so.6
#30 0x08122c57 in directory_files_internal_unwind (dh=161026626)
at /home/tmp/emacs/src/dired.c:137
#31 0x08151a4f in unbind_to (count=528, value=137410761)
at /home/tmp/emacs/src/eval.c:3337
#32 0x0812401e in file_name_completion (file=160401147, dirname=160401115,
all_flag=1, ver_flag=0) at /home/tmp/emacs/src/dired.c:721
#33 0x0812447c in Ffile_name_all_completions (file=160401147,
directory=160401115) at /home/tmp/emacs/src/dired.c:442
#34 0x0815312f in Ffuncall (nargs=3, args=0xbfabfb50)
at /home/tmp/emacs/src/eval.c:2985
#35 0x0817c99e in Fbyte_code (bytestr=136162331, vector=136162356, maxdepth=32)
at /home/tmp/emacs/src/bytecode.c:679
#36 0x081526d8 in Feval (form=136162317) at /home/tmp/emacs/src/eval.c:2319
#37 0x08154991 in internal_lisp_condition_case (var=137410761,
bodyform=136162317, handlers=136162453) at /home/tmp/emacs/src/eval.c:1414
#38 0x0817db50 in Fbyte_code (bytestr=136162139, vector=136162156, maxdepth=48)
at /home/tmp/emacs/src/bytecode.c:869
#39 0x08152b5a in funcall_lambda (fun=136162100, nargs=1,
arg_vector=0xbfabfe94) at /home/tmp/emacs/src/eval.c:3169
#40 0x08152fb0 in Ffuncall (nargs=2, args=0xbfabfe90)
at /home/tmp/emacs/src/eval.c:3028
#41 0x0817c99e in Fbyte_code (bytestr=149470715, vector=149471668, maxdepth=56)
at /home/tmp/emacs/src/bytecode.c:679
#42 0x08152b5a in funcall_lambda (fun=149031548, nargs=4,
arg_vector=0xbfabffc4) at /home/tmp/emacs/src/eval.c:3169
#43 0x08152fb0 in Ffuncall (nargs=5, args=0xbfabffc0)
at /home/tmp/emacs/src/eval.c:3028
#44 0x0817c99e in Fbyte_code (bytestr=143938675, vector=143888644, maxdepth=40)
at /home/tmp/emacs/src/bytecode.c:679
#45 0x08152b5a in funcall_lambda (fun=143876164, nargs=0,
arg_vector=0xbfac00f4) at /home/tmp/emacs/src/eval.c:3169
#46 0x08152fb0 in Ffuncall (nargs=1, args=0xbfac00f0)
at /home/tmp/emacs/src/eval.c:3028
#47 0x0817c99e in Fbyte_code (bytestr=144059891, vector=144063476, maxdepth=56)
at /home/tmp/emacs