On 25/11/2022 at 14:22, Dmitry Karasik wrote:
URL: http://karasik.eu.org/misc/cygwin/

Dear all,

Here is an exception that occurs when gtk_settings_get_default() is called from
a DLL and fork() is called afterwards.  The bug is not observed if the call is
made in the main program, nor is it observed if gtk is initialized but
gtk_settings_get_default() is never called.
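
The sequence described above (load a library at run time, call one function
from it, then fork) can be sketched roughly as follows. This is not the
reproducer from the tarball: call_then_fork() and its parameters are
hypothetical names chosen for illustration, and the function looked up is
assumed to take no arguments. Note that in the failure reported here the child
never exits, so the waitpid() call in this sketch would block.

```c
#include <dlfcn.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: load `lib`, call the no-argument function `sym`
 * from it, then fork and report how the child ended.
 * Returns the child's exit code, 128 + signal number if the child was
 * killed by a signal, or -1 if loading, lookup, fork, or wait failed. */
int call_then_fork(const char *lib, const char *sym)
{
    void *handle = dlopen(lib, RTLD_LAZY | RTLD_GLOBAL);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen: %s\n", err ? err : "unknown error");
        return -1;
    }

    /* POSIX idiom for converting the dlsym() result to a function pointer. */
    void (*fn)(void);
    *(void **)&fn = dlsym(handle, sym);
    if (fn == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlsym: %s\n", err ? err : "symbol resolved to NULL");
        dlclose(handle);
        return -1;
    }

    fn();  /* the library call made before forking */

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0)
        _exit(0);  /* child: exit immediately without running atexit handlers */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The library is deliberately left loaded across the fork, since that is the
situation the report is about.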

Warning: if you run ./dlload.exe without the CYGWIN environment variable being
set to invoke dumper, which terminates the process, your system will accumulate
zombie-like copies of dlload.exe that eat CPU. strace shows that these zombie
processes repeatedly hit exceptions in an endless loop. The following strace
output repeats forever after the fork:

--- Process 9108 (pid: 10439), exception c0000005 at 00000003f5baa8e0
  1960   21097 [main] perl 10439 exception::handle: In cygwin_except_handler exception 0xC0000005 at 0x3F5BAA8E0 sp 0xFFFFC5A8
    16   21113 [main] perl 10439 exception::handle: In cygwin_except_handler signal 11 at 0x3F5BAA8E0
    14   21127 [main] perl 10439 try_to_debug: debugger_command 'dumper "./dlload.exe"'
    23   21150 [main] perl 10439 break_here: break here
    12   21162 [main] perl 10439 sig_send: sendsig 0x13C, pid 10439, signal 11, its_me 1
    14   21176 [main] perl 10439 sig_send: wakeup 0x3F4
    15   21191 [main] perl 10439 sig_send: Waiting for pack.wakeup 0x3F4
    19   21210 [sig] perl 10439 sigpacket::process: returning -1
    19   21229 [sig] perl 10439 wait_sig: signalling pack.wakeup 0x3F4
    17   21246 [main] perl 10439 sig_send: returning 0x0 from sending signal 11

I encountered this problem when I saw random perl and python scripts hanging
(apparently waiting for a forked child that never exited); after interrupting
them with ^C, I noticed the accumulation of zombie processes.

The dumper's coredump doesn't show the culprit, but it does show this:
(gdb) bt
#0  0x00007ffa4870d744 in ntdll!ZwDelayExecution () from C:/WINDOWS/SYSTEM32/ntdll.dll
#1  0x00007ffa4601b03e in SleepEx () from C:/WINDOWS/System32/KERNELBASE.dll
#2  0x000000018006205a in try_to_debug () from C:/cygwin64/bin/cygwin1.dll
#3  0x00000001800624f6 in exception::handle(_EXCEPTION_RECORD*, void*, _CONTEXT*, _DISPATCHER_CONTEXT*) () from C:/cygwin64/bin/cygwin1.dll
#4  0x00007ffa4871241f in ntdll!.chkstk () from C:/WINDOWS/SYSTEM32/ntdll.dll
#5  0x00007ffa486c14a4 in ntdll!RtlRaiseException () from C:/WINDOWS/SYSTEM32/ntdll.dll
#6  0x00007ffa48710f4e in ntdll!KiUserExceptionDispatcher () from C:/WINDOWS/SYSTEM32/ntdll.dll
#7  0x00000003f5baa8e0 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

which seems to indicate that the exception is somewhere in the cygwin runtime.
I haven't got around to finding out exactly where that bug in the runtime is,
as I'd first like to hear whether there are any smart strategies for doing that.

Nor did I succeed in reducing gtk_settings_get_default() to something more
manageable (that call was already the most reduced case), even though I
recompiled gtk3 locally. Strangely, its strace doesn't show anything
suspicious: no forks, no open sockets, no pipe calls, just file opens (see
strace.gsettings).

Kindly advise how to proceed if I can help fix this; so far I'm a bit stuck.

I had trouble with dlopen myself, until I found that it cannot be nested, i.e. when a library that is loaded uses dlopen itself.
In my case, it helped to pass the flags RTLD_LAZY | RTLD_GLOBAL to dlopen.
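
For illustration, passing those flags looks roughly like the sketch below.
open_global() is a hypothetical wrapper name, not something from the reproducer,
and whether these flags help with the fork problem reported above is untested
here; only dlopen(), dlerror(), RTLD_LAZY and RTLD_GLOBAL are standard.

```c
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper: open `path` with lazy symbol binding and make its
 * symbols available to libraries loaded afterwards (RTLD_GLOBAL).
 * Returns the handle, or NULL after printing the dlerror() message. */
void *open_global(const char *path)
{
    void *handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL);
    if (handle == NULL) {
        const char *err = dlerror();
        fprintf(stderr, "dlopen(%s) failed: %s\n",
                path ? path : "(main program)",
                err ? err : "unknown error");
    }
    return handle;
}
```

RTLD_GLOBAL matters when a library loaded later needs symbols from one loaded
earlier; with the alternative RTLD_LOCAL those symbols are not made available
for resolving references in subsequently loaded libraries.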


Otherwise, to reproduce, download and unpack 
http://karasik.eu.org/misc/cygwin/cygwin-gtk-dlopen-fork-bug.tar
and run ./try there.



--
Problem reports:      https://cygwin.com/problems.html
FAQ:                  https://cygwin.com/faq/
Documentation:        https://cygwin.com/docs.html
Unsubscribe info:     https://cygwin.com/ml/#unsubscribe-simple
