Internationalization issue - string processing
We have a Java program that launches Cygwin bash processes which in turn run a script. The LC_ALL variable is set to "ja_JP". The script will execute processes using Unicode strings that are specified like this: "\u3053" (for the Hiragana letter Ko).

For some reason, when bash calls another program and passes the string above to it, the string is being converted to "0x3f 0x3f". The script that is being run contains the following command:

    perl dump.pl "\u3053"

The perl script just prints out the hex values of its arguments, and it displays:

    ??
    3f 3f

The behavior is not reproducible if we run bash from a CMD prompt.

I know this is pretty open-ended but are there any ideas as to what might be causing this sort of localization issue?

Ernie Coskrey
SIOS Technology Corp.

--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
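One way to narrow a report like this down is to dump the raw bytes of each argument, together with the locale variables the script actually inherited, once from the Java-launched bash and once from the CMD-launched bash. The following is only a minimal sketch and not part of the original report: `dump_arg_hex` is a hypothetical helper name, and the sketch assumes `od`, `tr`, and `sed` are on the PATH. A result of `3f 3f` means the argument already arrived as two ASCII question marks.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print the bytes of one
# argument as space-separated hex so the two launch paths can be compared.
dump_arg_hex() {
    printf '%s' "$1" | od -An -tx1 | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}

# Show the locale settings this process inherited, then dump every argument.
echo "LC_ALL=${LC_ALL:-<unset>} LANG=${LANG:-<unset>}"
for arg in "$@"; do
    dump_arg_hex "$arg"
    echo
done
```

Comparing the output of the two launch paths should show whether the environment differs or whether the bytes are already replaced before the script sees them.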
RE: cygwin 1.5.20-1, spinning pdksh, 100% CPU
> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Ernie Coskrey
> Sent: Wednesday, August 08, 2007 2:11 PM
> To: cygwin@cygwin.com
> Subject: RE: cygwin 1.5.20-1, spinning pdksh, 100% CPU
>
> > -----Original Message-----
> > From: [EMAIL PROTECTED] On Behalf Of Ernie Coskrey
> > Sent: Tuesday, July 31, 2007 3:40 PM
> > To: cygwin@cygwin.com
> > Subject: cygwin 1.5.20-1, spinning pdksh, 100% CPU
> >
> > I've run into a problem with cygwin 1.5.20-1 and pdksh 5.2.14. We've
> > got a pdksh.exe process that is spinning, using all the CPU.
> >
> > This scenario is very hard to reproduce, but has happened on our test
> > systems occasionally. It occurred recently, and I currently have gdb
> > attached to the process and have the symbols loaded. I see that pdksh
> > is continually calling "sigsuspend()", which is immediately returning
> > from cancelable_wait due to the fact that the signal_arrived event is
> > set. I also see that pdksh is waiting for a subprocess to complete,
> > and has a handle to the PID of that process - however the process has
> > long since terminated.
> >
> > It appears that something went wrong during delivery of SIGCHLD.
> >
> > I've got two questions related to this:
> >
> > - have there been changes between 1.5.20-1 and 1.5.24-2, or the latest
> >   snapshot, that might have fixed this issue? We've done some limited
> >   testing with 1.5.24-2 and haven't seen this happen yet, but as I said,
> >   it only happens rarely.
> > - is there anything I can look at in gdb to help identify what the
> >   issue is?
> >
> > Any suggestions would be appreciated!
> >
> > -
> > Ernie Coskrey
>
> I've discovered an interesting piece of information that I think is
> related to this. I'm hoping this might ring a bell with someone on the
> list.
>
> Looking at _main_tls->stack[], when I've set a breakpoint in
> handle_sigsuspend just after the cancelable_wait() call, I see the
> following entries:
>
> 0x6109186f 0x4132ac
>
> 0x6109186f is "sigdelayed()", which is the routine that should have
> been called to deliver the signal and reset the signal_arrived event.
> 0x4132ac is j_waitj (in pdksh).
>
> So, somehow, when this problem occurs, "sigdelayed" gets pushed onto
> the stack *before* j_waitj does. So, _sigbe never calls sigdelayed.
>
> I don't think there's ever a case where sigdelayed should be at
> _main_tls->stack[0]. However this happened is, I believe, the cause
> of this problem.
>
> Ernie Coskrey

Well, I think that I may have found the cause of this issue, and I believe that the problem exists in 1.5.24-2. Please take a look at what I think is the solution, and let me know if I'm mistaken.

I believe that the problem is in _sigbe, at the very end of the assembler code. _sigbe decrements the lock *before* it decrements incyg. This leaves a very small window where another thread - possibly the sig thread that's doing setup_handler() - can acquire the lock, see that incyg is still set to 1, and act accordingly. In setup_handler, this will cause the thread to go into _cygtls::interrupt_setup, which pushes sigdelayed onto the tls stack. But since we're not really in Cygwin code when this happens, sigdelayed() never gets executed and you end up spinning as we're seeing.

I'll post a patch to cygwin-patches.

Ernie Coskrey
RE: cygwin 1.5.20-1, spinning pdksh, 100% CPU
> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Ernie Coskrey
> Sent: Tuesday, July 31, 2007 3:40 PM
> To: cygwin@cygwin.com
> Subject: cygwin 1.5.20-1, spinning pdksh, 100% CPU
>
> I've run into a problem with cygwin 1.5.20-1 and pdksh 5.2.14. We've
> got a pdksh.exe process that is spinning, using all the CPU.
>
> This scenario is very hard to reproduce, but has happened on our test
> systems occasionally. It occurred recently, and I currently have gdb
> attached to the process and have the symbols loaded. I see that pdksh
> is continually calling "sigsuspend()", which is immediately returning
> from cancelable_wait due to the fact that the signal_arrived event is
> set. I also see that pdksh is waiting for a subprocess to complete,
> and has a handle to the PID of that process - however the process has
> long since terminated.
>
> It appears that something went wrong during delivery of SIGCHLD.
>
> I've got two questions related to this:
>
> - have there been changes between 1.5.20-1 and 1.5.24-2, or the latest
>   snapshot, that might have fixed this issue? We've done some limited
>   testing with 1.5.24-2 and haven't seen this happen yet, but as I said,
>   it only happens rarely.
> - is there anything I can look at in gdb to help identify what the
>   issue is?
>
> Any suggestions would be appreciated!
>
> -
> Ernie Coskrey

I've discovered an interesting piece of information that I think is related to this. I'm hoping this might ring a bell with someone on the list.

Looking at _main_tls->stack[], when I've set a breakpoint in handle_sigsuspend just after the cancelable_wait() call, I see the following entries:

0x6109186f 0x4132ac

0x6109186f is "sigdelayed()", which is the routine that should have been called to deliver the signal and reset the signal_arrived event.
0x4132ac is j_waitj (in pdksh).

So, somehow, when this problem occurs, "sigdelayed" gets pushed onto the stack *before* j_waitj does. So, _sigbe never calls sigdelayed.

I don't think there's ever a case where sigdelayed should be at _main_tls->stack[0]. However this happened is, I believe, the cause of this problem.

Ernie Coskrey
RE: cygwin 1.5.20-1, spinning pdksh, 100% CPU
> -----Original Message-----
> From: Igor Peshansky [mailto:[EMAIL PROTECTED]]
> Sent: Monday, August 06, 2007 5:59 PM
> To: Ernie Coskrey
> Cc: cygwin@cygwin.com
> Subject: RE: cygwin 1.5.20-1, spinning pdksh, 100% CPU
>
> On Mon, 6 Aug 2007, Ernie Coskrey wrote:
>
> > > Quite possibly. There were changes to signal handling since 1.5.20,
> > > IIRC. Unless I'm mistaken, there's even a patch for a race condition
> > > in process handling code (though it's not in 1.5.24, I think).
> >
> > I just want to make sure I understand this - are you talking about a
> > change that has been made since 1.5.24-2 was released, which is in the
> > snapshot view now? Or did you mean a fix that was made sometime between
> > 1.5.20-1 and 1.5.24-2?
>
> I meant the former, but I don't know if these changes have actually fixed
> your problem.

I'll download the latest snapshot and look at what's changed. Do you remember where the changes might be located - I'd guess somewhere in sigproc.cc, exception.cc, and/or wait.cc. Or if you remember the date and/or subject of the email discussion that I could look at, that would be very helpful as well.

> Any particulars about the machines on which this happens? Are they
> multi-core? I don't recall seeing a cygcheck output from an affected
> machine...
> Igor

This happens on a variety of hardware - single-CPU is where it's the biggest problem since the system becomes nearly unusable. But we've seen it on multi-core and multi-physical-CPU systems as well. Here's cygcheck from one of the systems where it's happened a few times:

Cygwin Configuration Diagnostics
Current System Time: Tue Aug 07 09:01:03 2007

Windows 2003 Server Ver 5.2 Build 3790 Service Pack 2

Running in Terminal Service session

Path:   c:\WINDOWS\system32
        c:\WINDOWS
        c:\WINDOWS\System32\Wbem
        c:\Program Files\SUperior SU
        c:\Program Files\Microsoft SQL Server\80\Tools\BINN
        C:\LK\bin
        c:\SDR
        c:\SDR\support
        c:\Program files\Debugging Tools for Windows

SysDir: C:\WINDOWS\system32
WinDir: C:\WINDOWS

HOME = '/home/Administrator'

Use '-r' to scan registry

a:  fd   N/A   N/A
c:  hd   NTFS     8662Mb  84%  CP CS UN PA FC
d:  net  NTFS    17351Mb  90%  CP CS UN PA FC  BUILD
e:  cd   N/A   N/A
h:  hd   NTFS     4337Mb   1%  CP CS UN PA FC  Shared_H
i:  hd   N/A   N/A
j:  hd   NTFS    17367Mb   1%  CP CS UN PA FC  Shared_J
k:  hd   NTFS    17367Mb   1%  CP CS UN PA FC  Shared_K
l:  hd   NTFS    17343Mb   1%  CP CS UN PA FC  Shared_L
n:  hd   NTFS    17476Mb   1%  CP CS UN PA FC  Shared_N
o:  hd   NTFS     1027Mb   1%  CP CS UN PA FC  Shared_O
p:  hd   N/A   N/A
r:  hd   N/A   N/A
s:  hd   NTFS    69954Mb   1%  CP CS UN PA FC  iSCSI_S
t:  hd   NTFS    69954Mb   1%  CP CS UN PA FC  ISCSI_T
v:  net  NTFS     8096Mb  73%  CP CS UN PA FC
w:  net  NTFS  1402454Mb  34%  CP CS    PA     coskrey
x:  net  NTFS    17355Mb  26%  CP CS UN PA FC  Dev_Y
y:  hd   NTFS     8665Mb   7%  CP CS UN PA FC  Vol_Y
z:  hd   N/A   N/A

Found: C:\LK\bin\awk.exe
Found: C:\LK\bin\bash.exe
Found: C:\LK\bin\cat.exe
Found: C:\LK\bin\cp.exe
Not Found: cpp (good!)
Not Found: crontab
Found: C:\LK\bin\find.exe
Not Found: gcc
Found: C:\LK\bin\gdb.exe
Found: C:\LK\bin\grep.exe
Found: C:\LK\bin\kill.exe
Found: c:\Program files\Debugging Tools for Windows\kill.exe
Not Found: ld
Found: C:\LK\bin\ls.exe
Not Found: make
Found: C:\LK\bin\mv.exe
Not Found: patch
Found: C:\LK\bin\perl.exe
Found: C:\LK\bin\rm.exe
Found: C:\LK\bin\sed.exe
Not Found: ssh
Found: C:\LK\bin\sh.exe
Found: C:\LK\bin\tar.exe
Found: C:\LK\bin\test.exe
Found: C:\LK\bin\vi.exe
Found: C:\LK\bin\vim.exe

   56k 2007/07/14 C:\LK\bin\cygbz2-1.dll
    7k 2007/07/14 C:\LK\bin\cygcharset-1.dll
    7k 2007/07/14 C:\LK\bin\cygcrypt-0.dll
   40k 2007/07/14 C:\LK\bin\cygform-8.dll
   45k 2007/07/14 C:\LK\bin\cygform5.dll
   35k 2007/07/14 C:\LK\bin\cygform6.dll
   48k 2007/07/14 C:\LK\bin\cygform7.dll
   28k 2007/07/14 C:\LK\bin\cyggdbm-3.dll
   30k 2007/07/14 C:\LK\bin\cyggdbm-4.dll
   19k 2007/07/14 C:\LK\bin\cyggdbm.dll
   15k 2007/07/14 C:\LK\bin\cyggdbm_compat-3.dll
   15k 2007/07/14 C:\LK\bin\cyggdbm_compat-4.dll
   17k 2007/07/14 C:\LK\bin\cyghistory4.dll
   29k 2007/07/14 C:\LK\bin\cyghistory5.dll
   24k 2007/07/14 C:\LK\bin\cyghistory6.dll
  947k 2007/07/14 C:\LK\bin\cygiconv-2.dll
   22k 2007/07/14 C:\LK\bin\cygintl-1.dll
   37k 2007/07/14 C:\LK\bin\cygintl-2.dll
   31k 2007/07/14 C:\LK\bin\cygintl-3.dll
   21k 2007/07/14 C:\LK\bin\cygintl.dll
   21k 2007/07/14 C:\LK\bin\cygmenu-8.dll
   26k 2007/07/14 C:\LK\bin\cygmenu5.dll
   20k 2007/07/14 C:\LK\bin\cygmenu6.dll
   29k 2007/07/14 C:\LK\bin\cygmenu7.dll
   67k 2
RE: cygwin 1.5.20-1, spinning pdksh, 100% CPU
> Quite possibly. There were changes to signal handling since 1.5.20,
> IIRC. Unless I'm mistaken, there's even a patch for a race condition in
> process handling code (though it's not in 1.5.24, I think).

I just want to make sure I understand this - are you talking about a change that has been made since 1.5.24-2 was released, which is in the snapshot view now? Or did you mean a fix that was made sometime between 1.5.20-1 and 1.5.24-2?

> > Any suggestions would be appreciated!
>
> Posting a sequence of steps that reliably reproduces the problem for
> you would be great (but not necessarily easy).

We've seen the issue happen with the following scripts. Run a few instances of "tst.sh". Occasionally, one will become hung - if you terminate the other tst.sh with Ctrl-C, you'll see that there's a subtest.sh shell that is using up all the CPU.

First - generate "tstfile" by running

    ls -l /bin > tstfile

tst.sh
==
while true
do
    for ltr in a b c d e f g
    do
        out=`./subtest.sh $ltr`
        echo Found $out
        date
    done
done

subtest.sh
==
for i in `seq 1 100`
do
    f=`awk '{if(NR == i)print}' i=$i tstfile`
    m=`/bin/echo $f | grep $1`
    if [ ! -z "$m" ]
    then
        echo $i: $m
    fi
done

-
Ernie Coskrey
SteelEye Technology, Inc.
RE: cygwin 1.5.20-1, spinning pdksh, 100% CPU
0 0x006874b8
0x22c558: 0x0022c588 0x610917b8 0x0042db80 0x0068b3f0
0x22c568: 0x0022c588 0x600301dc 0x006854d8 0x0003
0x22c578: 0x0022c588 0x006874b8 0x006854d8 0x0003
0x22c588: 0x0022c5a8 0x004126e0 0x006842a0 0x
0x22c598: 0x0042972b 0x006874b8 0x 0x006874b8
0x22c5a8: 0x0022c698 0x0040b160 0x0068b3f0 0x
0x22c5b8: 0x0068a614 0x0001 0x0022c680 0x0019
0x22c5c8: 0x0068bbe8 0x 0x61171d44 0x0068
0x22c5d8: 0x 0x 0x61171dd4 0x0001
0x22c5e8: 0x 0x 0x0001 0x
0x22c5f8: 0x0022c640 0x 0x 0x
0x22c608: 0x00687518 0x 0x0004 0x00685470
0x22c618: 0x0068ad98 0x 0x0001 0x61104ab4
0x22c628: 0x0003 0x0001 0x0668 0x0068a614
0x22c638: 0x00685478 0x610564f7 0x0068ad98 0x006854bc
0x22c648: 0x0001 0x0068ad60 0x0068ad60 0x
0x22c658: 0x00685ae0 0x0001 0x 0x0068b3f0
0x22c668: 0x0022c698 0x0041 0x00685530 0x006854b0
0x22c678: 0x0080 0x0068a614 0x0001 0x006854bc
0x22c688: 0x00cb 0x006874b8 0x 0x00687350
0x22c698: 0x0022c6c8 0x0040a654 0x006874b8 0x0022c6b0
0x22c6a8: 0x0020 0x6105642c 0x00685498 0x00685498
0x22c6b8: 0x0068549c 0x001d 0x 0x
0x22c6c8: 0x0022c718 0x0040d80a 0x006874b8 0x0020
0x22c6d8: 0x0068a610 0x 0x001d 0x

Ernie Coskrey
SteelEye Technology, Inc.
RE: cygwin 1.5.20-1, spinning pdksh, 100% CPU
> -----Original Message-----
> From: Igor Peshansky
>
> On Tue, 31 Jul 2007, Ernie Coskrey wrote:
>
> > I've run into a problem with cygwin 1.5.20-1 and pdksh 5.2.14. We've
> > got a pdksh.exe process that is spinning, using all the CPU.
> >
> > This scenario is very hard to reproduce, but has happened on our test
> > systems occasionally. It occurred recently, and I currently have gdb
> > attached to the process and have the symbols loaded.
>
> I assume you've rebuilt pdksh from source, since the packaged binary is
> stripped... Do you also have the symbols for the Cygwin DLL?

Yes, I've built both pdksh and cygwin1.dll from source and have the symbols.

> > I see that pdksh is continually calling "sigsuspend()", which is
> > immediately returning from cancelable_wait due to the fact that the
> > signal_arrived event is set.
>
> Do you mean the sigpause() call? Can you see which signal it attempts to
> suspend? Can you email me (privately, if you wish) the stack dump from
> gdb?

It's sigsuspend() in j_waitj - line 1191 in jobs.c. It calls sigsuspend(&sm_default), and sm_default is 0 (no signals are blocked). This immediately returns, and I see that j->state is still PRUNNING every time.

> > I also see that pdksh is waiting for a subprocess to complete, and has a
> > handle to the PID of that process - however the process has long since
> > terminated.
>
> That's normal (I think). Cygwin may not deliver SIGCHLD immediately after
> process termination. Until pdksh gets SIGCHLD, it'll keep the process
> handle.
>
> > It appears that something went wrong during delivery of SIGCHLD.
>
> Does this happen before or after j_sigchld() gets invoked?

I suspect that j_sigchld never got invoked, or didn't run properly, but can't definitively prove that.

> > I've got two questions related to this:
> >
> > - have there been changes between 1.5.20-1 and 1.5.24-2, or the latest
> >   snapshot, that might have fixed this issue? We've done some limited
> >   testing with 1.5.24-2 and haven't seen this happen yet, but as I said,
> >   it only happens rarely.
>
> Quite possibly. There were changes to signal handling since 1.5.20, IIRC.
> Unless I'm mistaken, there's even a patch for a race condition in process
> handling code (though it's not in 1.5.24, I think).
>
> > - is there anything I can look at in gdb to help identify what the issue
> >   is?
> >
> > Any suggestions would be appreciated!
>
> Posting a sequence of steps that reliably reproduces the problem for you
> would be great (but not necessarily easy).

I wish I could supply this, but the problem happens very rarely. I've run many thousands of test shell iterations and haven't seen it reoccur yet.

> As I said above, a stack dump (with full pdksh symbols) would help...
> That might mean that you'd need to build an unstripped pdksh and attempt
> to reproduce the problem again.
> Igor
> --

Here's a stack trace of the thread where the spin is occurring. The other threads in the process are quiet - the signal thread is in ReadFile as expected, and the other threads are all in stub routines doing WaitForSingleObject.

(gdb) bt
#0  handle_sigsuspend (tempmask=0) at ../../../../src/winsup/cygwin/exceptions.cc:694
#1  0x61094b93 in sigsuspend (set=0x42db80) at ../../../../src/winsup/cygwin/signal.cc:477
#2  0x610917b8 in _sigfe () at ../../../../src/winsup/cygwin/cygserver.h:82
#3  0x0022c588 in ?? ()
#4  0x600301dc in ?? ()
#5  0x006854d8 in ?? ()
#6  0x0003 in ?? ()
#7  0x0022c588 in ?? ()
#8  0x006874b8 in ?? ()
#9  0x006854d8 in ?? ()
#10 0x0003 in ?? ()
#11 0x0022c5a8 in ?? ()
#12 0x004126e0 in waitlast () at ../src/jobs.c:729
#13 0x004126e0 in waitlast () at ../src/jobs.c:729
#14 0x0040b160 in expand (
    cp=0x6874b8 "\001R\001M\001T\001I\001N\001S\001R\001E\001A\001S\001O\001N\001=\003$L KBIN/ins_list -d \"$EQVRMTSYS\" -t \"$EQVRMTTAG\" 2>NUL: | cut -d\001 -f8",
    wp=0x22c6b0, f=32) at ../src/eval.c:533
#15 0x0040a654 in evalstr (
    cp=0x6874b8 "\001R\001M\001T\001I\001N\001S\001R\001E\001A\001S\001O\001N\001=\003$L KBIN/ins_list -d \"$EQVRMTSYS\" -t \"$EQVRMTTAG\" 2>NUL: | cut -d\001 -f8",
    f=32) at ../src/eval.c:113
#16 0x0040d80a in comexec (t=0x6871e0, tp=0x0, ap=0x687350, flags=0) at ../src/exec.c:555
#17 0x0040cc7d in execute (t=0x6871e0, flags=0) at ../src/exec.c:155
#18 0x0040ce39 in execute (t=0x687778, flags=0) at ../src/exec.c:192
#19 0x0040d311 in execute (t=0x686620, flags=1) at ../src/exec.c:367
#20 0x004124c1 in ex
cygwin 1.5.20-1, spinning pdksh, 100% CPU
I've run into a problem with cygwin 1.5.20-1 and pdksh 5.2.14. We've got a pdksh.exe process that is spinning, using all the CPU.

This scenario is very hard to reproduce, but has happened on our test systems occasionally. It occurred recently, and I currently have gdb attached to the process and have the symbols loaded. I see that pdksh is continually calling "sigsuspend()", which is immediately returning from cancelable_wait due to the fact that the signal_arrived event is set. I also see that pdksh is waiting for a subprocess to complete, and has a handle to the PID of that process - however the process has long since terminated.

It appears that something went wrong during delivery of SIGCHLD.

I've got two questions related to this:

- have there been changes between 1.5.20-1 and 1.5.24-2, or the latest snapshot, that might have fixed this issue? We've done some limited testing with 1.5.24-2 and haven't seen this happen yet, but as I said, it only happens rarely.
- is there anything I can look at in gdb to help identify what the issue is?

Any suggestions would be appreciated!

-
Ernie Coskrey
SteelEye Technology, Inc.
803-808-4275
RE: Cygwin build error
> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Corinna Vinschen
> Sent: Friday, April 28, 2006 4:28 AM
> To: [EMAIL PROTECTED]
> Cc: cygwin@cygwin.com
> Subject: Re: Cygwin build error
>
> This is a newlib problem. I've redirected this mail to the appropriate
> list newlib AT sourceware DOT org.
>
> On Apr 27 15:14, Ernie Coskrey wrote:
> > I ran into the following problem building the latest cygwin snapshot:
> >
> > configure: loading cache .././config.cache
> > configure: error: `CFLAGS' has changed since the previous run:
> > configure:   former value:  -O2 -g -O2
> > configure:   current value: -O2 -g -O2
> > configure: error: changes in the environment can compromise the build
> > configure: error: run `make distclean' and/or `rm .././config.cache' and start over
> > configure: error: /bin/sh '../../../../src/newlib/libc/configure' failed for libc
> >
> > By piping the output to a file, I saw that the former value of CFLAGS
> > is "-O2 -g -O2  " (two trailing spaces), while the current value is
> > "-O2 -g -O2 " (one trailing space). This causes the comparison in
> > libc/configure to fail.
> >
> > The way I've resolved this is to replace the following line:
> >
> >     if test "x$ac_old_val" != "x$ac_new_val"; then
> >
> > with
> >
> >     if test "`echo $ac_old_val`" != "`echo $ac_new_val`"; then
> >
> > wherever it appears in any "configure" script (there are 75 configure
> > scripts that contain this test, BTW). There may be a more elegant way
> > around this, but I haven't found it. Running "make distclean" or
> > removing config.cache doesn't resolve the problem.
> >
> > -
> > Ernie Coskrey
> > SteelEye Technology, Inc.
> > 803-461-3875
>
> Corinna

This problem isn't limited to newlib: the same fix must be applied to a number of non-newlib configure scripts. However, I have found a simpler solution than patching all 70-plus configure scripts.

The root of the problem is that the variable "CFLAGS_FOR_TARGET" gets defined in the top-level Makefile as follows:

    CFLAGS_FOR_TARGET = -O2 $(CFLAGS) $(SYSROOT_CFLAGS_FOR_TARGET)

Since SYSROOT_CFLAGS_FOR_TARGET is usually empty, you end up with an extra space at the end of CFLAGS_FOR_TARGET (in my case, anyway).

The following patch will resolve the problem without requiring any changes in the underlying configure scripts. This patch is for "src/Makefile.in" - the top-level Makefile.in. It uses the make "strip" function to remove the extra whitespace from CFLAGS_FOR_TARGET and CXXFLAGS_FOR_TARGET.

--- Makefile.in.ORIG 2006-05-31 08:49:14.16650 -0400
+++ Makefile.in 2006-05-31 11:08:25.150875000 -0400
@@ -383,7 +383,7 @@
 # CFLAGS will be just -g. We want to ensure that TARGET libraries
 # (which we know are built with gcc) are built with optimizations so
 # prepend -O2 when setting CFLAGS_FOR_TARGET.
-CFLAGS_FOR_TARGET = -O2 $(CFLAGS) $(SYSROOT_CFLAGS_FOR_TARGET)
+CFLAGS_FOR_TARGET = $(strip -O2 $(CFLAGS) $(SYSROOT_CFLAGS_FOR_TARGET))
 SYSROOT_CFLAGS_FOR_TARGET = @SYSROOT_CFLAGS_FOR_TARGET@
 
 # If GCC_FOR_TARGET is not overriden on the command line, then this
@@ -423,7 +423,7 @@
     fi; \
   fi`
 
-CXXFLAGS_FOR_TARGET = $(CXXFLAGS) $(SYSROOT_CFLAGS_FOR_TARGET)
+CXXFLAGS_FOR_TARGET = $(strip $(CXXFLAGS) $(SYSROOT_CFLAGS_FOR_TARGET))
 LIBCXXFLAGS_FOR_TARGET = $(CXXFLAGS_FOR_TARGET) -fno-implicit-templates
 
 GCJ_FOR_TARGET=$(STAGE_CC_WRAPPER) @GCJ_FOR_TARGET@ $(FLAGS_FOR_TARGET)

-
Ernie Coskrey
Cygwin build error
I ran into the following problem building the latest cygwin snapshot:

configure: loading cache .././config.cache
configure: error: `CFLAGS' has changed since the previous run:
configure:   former value:  -O2 -g -O2
configure:   current value: -O2 -g -O2
configure: error: changes in the environment can compromise the build
configure: error: run `make distclean' and/or `rm .././config.cache' and start over
configure: error: /bin/sh '../../../../src/newlib/libc/configure' failed for libc

By piping the output to a file, I saw that the former value of CFLAGS is "-O2 -g -O2  " (two trailing spaces), while the current value is "-O2 -g -O2 " (one trailing space). This causes the comparison in libc/configure to fail.

The way I've resolved this is to replace the following line:

    if test "x$ac_old_val" != "x$ac_new_val"; then

with

    if test "`echo $ac_old_val`" != "`echo $ac_new_val`"; then

wherever it appears in any "configure" script (there are 75 configure scripts that contain this test, BTW). There may be a more elegant way around this, but I haven't found it. Running "make distclean" or removing config.cache doesn't resolve the problem.

-
Ernie Coskrey
SteelEye Technology, Inc.
803-461-3875
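The effect of the workaround above can be seen in isolation with a short sketch. This is an illustration only, not the code that configure actually runs: `normalize_ws` is a hypothetical helper that relies on the same unquoted word splitting as the `echo $ac_old_val` trick, with globbing switched off so that flags containing `*` are left alone, and it assumes the default IFS.

```shell
#!/bin/sh
# Hypothetical helper: collapse runs of whitespace and trim both ends.
# The body runs in a subshell so that "set -f" and "set --" do not leak out.
normalize_ws() (
    set -f          # no pathname expansion on the unquoted $1 below
    set -- $1       # word splitting discards leading, trailing and repeated blanks
    printf '%s' "$*"
)

old='-O2 -g -O2  '  # cached value: two trailing spaces
new='-O2 -g -O2 '   # current value: one trailing space

# The strict string comparison treats the two values as different.
[ "x$old" != "x$new" ] && echo "strict comparison: values differ"

# After normalization the two values compare equal.
[ "$(normalize_ws "$old")" = "$(normalize_ws "$new")" ] && echo "normalized comparison: values match"
```

Using `printf` rather than `echo` here avoids the case where a value that begins with something like `-n` would be taken as an option by `echo`.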
RE: Call for testing Cygwin snapshot
> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Christopher Faylor
> Sent: Tuesday, April 25, 2006 4:39 PM
> To: cygwin@cygwin.com
> Subject: Re: Call for testing Cygwin snapshot
>
> On Tue, Apr 25, 2006 at 04:33:37PM -0400, Ernie Coskrey wrote:
> >Well, what I got from your message was that you were pretty sure that
> >your fix may have addressed the problem, but not 100% sure. That's why
> >I posted this follow-up; it's possible that Jerry has found a scenario
> >that causes this problem to occur. Maybe not, but if he can reproduce
> >it it would be worth checking.
>
> There are all sorts of "cygwin hang" bug reports out there. Since this
> is a problem that showed up in a particular snapshot and the problem
> that you are talking about was something that supposedly happened for
> any version of cygwin from 2003 to (at least) February 2006, I don't see
> any reason to think that this is an issue since it would also show up in
> a pre 2006-03-13 version of cygwin -- unless you have some insight into
> the problem that I'm missing.
>
> cgf

Nope, no additional insight, just a hunch -- that apparently has turned out to be wrong. :-)

BTW, we're not seeing ANY hangs in the 1.5.20 snapshots; we'd reported a few in 1.5.19-4 and those all have been addressed.

Ernie
RE: Call for testing Cygwin snapshot
> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Christopher Faylor
> Sent: Tuesday, April 25, 2006 4:26 PM
> To: cygwin@cygwin.com
> Subject: Re: Call for testing Cygwin snapshot
>
> On Tue, Apr 25, 2006 at 03:37:46PM -0400, Ernie Coskrey wrote:
> >> -----Original Message-----
> >> From: cygwin-owner
> >> [mailto:cygwin-owner]...
>
> Btw, to the OP: *please* don't quote raw email addresses, especially
> when it's the cygwin or cygwin-owner email address. Adding this is just
> noise and helps increase the already incredible spam burden presented to
> the cygwin and (especially) postmaster mailing lists.
>
> >> Of Jerry D. Hedden
> >> Sent: Tuesday, April 25, 2006 9:27 AM
> >> To: cygwin
> >^^
> >> Subject: RE: Call for testing Cygwin snapshot
> >>
> >>As I said, these sort of problems started after the 2006-03-09
> >>snapshot. I double checked, and the problem does occur with the
> >>2006-03-13 snapshot.
> >
> >I wonder if this might be related to the following:
> >
> >http://cygwin.com/ml/cygwin/2006-02/msg01062.html
> >
> >The fix suggested in the original message -
> >http://www.cygwin.com/ml/cygwin-patches/2003-q2/msg4.html - might
> >help.
>
> You've pointed to my message which indicates that I've fixed this in
> another way. And, the OP indicates that this hang was introduced in a
> specific snapshot so I don't see why this would be an issue in that
> snapshot.
>
> Nevertheless, the patch in the message that you are referring to is
> still a band-aid and still will not be applied.
>
> cgf

Well, what I got from your message was that you were pretty sure that your fix may have addressed the problem, but not 100% sure. That's why I posted this follow-up; it's possible that Jerry has found a scenario that causes this problem to occur. Maybe not, but if he can reproduce it it would be worth checking.

I agree that the original patch is a band-aid and shouldn't be applied. There were some follow-ups to that message that talked about different ways to address the problem, if it turns out that Jerry's problem is the same.

Ernie
RE: Call for testing Cygwin snapshot
> -----Original Message-----
> From: [EMAIL PROTECTED] On Behalf Of Jerry D. Hedden
> Sent: Tuesday, April 25, 2006 9:27 AM
> To: cygwin@cygwin.com
> Subject: RE: Call for testing Cygwin snapshot
>
> Jerry D. Hedden wrote:
> >I have a cron job (a bash script)
> >that runs every 6 minutes, polling and downloading info off the web.
> >
> >The problem is the script hangs at various places and the stuck
> >processes keep building up.
> >
> >Further, I have to kill these processes using the task monitor: kill
> >reports 'No such process'.
>
> Christopher Faylor replied:
> > As mentioned above, a test case showing the problem sure would be nifty.
>
> I agree and would have provided one if I could. However, I have no idea
> what is causing this, nor how to write a test case for it.
>
> As I said, it's a cron job running a bash script - nothing fancy. The
> hang does not happen on every invocation of the script, but it does
> occur frequently. Where in the script it gets stuck seems to be
> random: wget, mkdir, mv, date, diff, etc.
>
> > Also, knowing the first snapshot which shows the problem would be helpful.
>
> As I said, these sort of problems started after the 2006-03-09 snapshot.
> I double checked, and the problem does occur with the 2006-03-13
> snapshot.

I wonder if this might be related to the following:

http://cygwin.com/ml/cygwin/2006-02/msg01062.html

The fix suggested in the original message - http://www.cygwin.com/ml/cygwin-patches/2003-q2/msg4.html - might help.

Ernie Coskrey
SteelEye Technology, Inc.
Re: Shells hang during script execution
>On Wed, Mar 01, 2006 at 01:01:46PM -0500, Ernie Coskrey wrote:
>>>>Here's a description of a second hang condition we were encountering,
>>>>along with a patch for it.
>>>>
>>>>The application (pdksh in this case) does a read on a pipe, which
>>>>eventually calls pipe.cc fhandler_pipe::read in Thread 1.  This creates
>>>>a new cygthread with "read_pipe()" as the function.  Then it calls
>>>>th->detach(read_state).
>>>>
>>>>When the hang occurs, the new thread gets terminated early, before
>>>>cygthread::stub() can call "callfunc()".  You see the error message
>>>>"erroneous thread activation".  I'm not sure what's causing the thread
>>>>to fail activation, but the result is, the read_state semaphore never
>>>>gets signalled.
>>>
>>>Sorry but this is another band-aid around a problem.  The real problem
>>>is that the code shouldn't get into the state that you are describing.
>>>That's why cygwin prints an error message - it is a serious problem.
>>>Making the code deal gracefully with a problem like this isn't going
>>>to solve the underlying issue.
>>>
>>>If you can figure out what's causing the erroneous thread activation
>>>then that will be the real culprit.
>>>
>>>cgf
>>>
>>
>>OK, I believe I've tracked this down.
>>
>>The problem occurs when we get into a read_pipe cygthread constructor
>>(cygthread::cygthread()) with a NULL h and an ev that is signalled.
>>When this condition exists, a hang can occur as follows:
>>
>>1) Creator thread calls detach().  This waits for pipe_state to be
>>   released twice
>>2) read_pipe thread calls read_pipe, reads data, and releases the
>>   semaphore twice
>>3) Creator thread goes to WFSO(*this, INFINITE) which returns immediately
>>   because ev was set when the thread was created.
>>4) Creator thread initiates another read_pipe cygthread to read more pipe
>>   data.
>>
>>At this point, there's a race: if the Creator thread gets past the
>>initialization part of the constructor, which sets __name(name), BEFORE
>>the original read_pipe thread gets to the part of cygthread::stub()
>>that sets info->__name = NULL, then you'll see the hang.  The new
>>read_pipe thread will give the "erroneous thread activation" message,
>>and the parent will be stuck waiting for data that will never arrive.
>>
>>The only path that leaves an unused thread structure in a state where
>>h==NULL and ev is signalled is cygthread::release().  So the fix is
>>simple:
>>
>>$ cat cygthread.cc.udiff
>>--- cygthread.cc.ORIG 2006-02-22 10:57:42.123931300 -0500
>>+++ cygthread.cc 2006-03-01 12:59:23.255023000 -0500
>>@@ -268,7 +268,12 @@
>> cygthread::release (bool nuke_h)
>> {
>>   if (nuke_h)
>>+{
>>     h = NULL;
>>+
>>+if (ev)
>>+  ResetEvent (ev);
>>+}
>> #ifdef DEBUGGING
>>   __oldname = __name;
>>   debug_printf ("released thread '%s'", __oldname);
>
>Nice analysis.  Thank you.  I think it's easier to fix this by just
>making the ev event auto-reset then this condition would be caught in
>terminate thread, as it was meant to be.
>
>cgf

Here's a patch for the problem that works with the latest snapshot.

- Ernie Coskrey
  SteelEye Technology, Inc.

--- cygthread.cc.ORIG 2006-03-01 17:40:44.0 -0500
+++ cygthread.cc 2006-03-16 14:54:04.148312500 -0500
@@ -78,7 +78,7 @@
   debug_printf ("thread '%s', id %p, stack_ptr %p", info->name (), info->id, info->stack_ptr);
   if (!info->ev)
     {
-      info->ev = CreateEvent (&sec_none_nih, TRUE, FALSE, NULL);
+      info->ev = CreateEvent (&sec_none_nih, FALSE, FALSE, NULL);
       info->thread_sync = CreateEvent (&sec_none_nih, FALSE, FALSE, NULL);
     }
 }
@@ -197,8 +197,6 @@
   HANDLE htobe;
   if (h)
     {
-      if (ev)
-        ResetEvent (ev);
       while (!thread_sync)
         low_priority_sleep (0);
       SetEvent (thread_sync);
@@ -223,7 +221,6 @@
       while (!ev)
         low_priority_sleep (0);
       WaitForSingleObject (ev, INFINITE);
-      ResetEvent (ev);
     }
   h = htobe;
 }
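The difference between a manual-reset and an auto-reset event, which is what the patch above changes for ev, can be sketched with a small single-threaded model. This is only an illustration in standard C++: ModelEvent and next_wait_returns_early are hypothetical names, try_wait stands in for a zero-timeout wait, and none of this is the Win32 event API or Cygwin's cygthread code.

```cpp
#include <cassert>

// Hypothetical model of an event object (not the Win32 API).
// manual_reset == true : a successful wait leaves the event signalled.
// manual_reset == false: a successful wait clears it (auto-reset).
class ModelEvent {
public:
    explicit ModelEvent(bool manual_reset) : manual_reset_(manual_reset) {}

    void set() { signalled_ = true; }
    void reset() { signalled_ = false; }

    // Models a zero-timeout wait: reports whether the event is signalled,
    // and consumes the signal when the event is auto-reset.
    bool try_wait() {
        const bool was_signalled = signalled_;
        if (was_signalled && !manual_reset_)
            signalled_ = false;
        return was_signalled;
    }

private:
    bool signalled_ = false;
    const bool manual_reset_;
};

// Models reuse of one thread slot: the first worker signals completion,
// the creator waits for it, and then the slot is handed to a new worker
// that has not signalled yet. Returns true if the creator's next wait
// would return immediately because of the stale signal.
bool next_wait_returns_early(bool manual_reset) {
    ModelEvent ev(manual_reset);
    ev.set();                          // first worker signals completion
    const bool first = ev.try_wait();  // creator waits for the first worker
    assert(first);
    (void)first;
    // Slot is reused here; nobody has called reset() or set() again.
    return ev.try_wait();
}
```

In this model next_wait_returns_early(true) reports the stale signal leaking into the next wait, which is the situation the explicit ResetEvent calls were guarding against, while next_wait_returns_early(false) does not, because the first wait already consumed the signal.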
RE: Shells hang during script execution
>>Here's a description of a second hang condition we were encountering,
>>along with a patch for it.
>>
>>The application (pdksh in this case) does a read on a pipe, which
>>eventually calls pipe.cc fhandler_pipe::read in Thread 1.  This creates
>>a new cygthread with "read_pipe()" as the function.  Then it calls
>>th->detach(read_state).
>>
>>When the hang occurs, the new thread gets terminated early, before
>>cygthread::stub() can call "callfunc()".  You see the error message
>>"erroneous thread activation".  I'm not sure what's causing the thread
>>to fail activation, but the result is, the read_state semaphore never
>>gets signalled.
>
>Sorry but this is another band-aid around a problem.  The real problem
>is that the code shouldn't get into the state that you are describing.
>That's why cygwin prints an error message - it is a serious problem.
>Making the code deal gracefully with a problem like this isn't going
>to solve the underlying issue.
>
>If you can figure out what's causing the erroneous thread activation
>then that will be the real culprit.
>
>cgf
>

OK, I believe I've tracked this down.

The problem occurs when we get into a read_pipe cygthread constructor
(cygthread::cygthread()) with a NULL h and an ev that is signalled.
When this condition exists, a hang can occur as follows:

1) Creator thread calls detach().  This waits for pipe_state to be
   released twice
2) read_pipe thread calls read_pipe, reads data, and releases the
   semaphore twice
3) Creator thread goes to WFSO(*this, INFINITE) which returns immediately
   because ev was set when the thread was created.
4) Creator thread initiates another read_pipe cygthread to read more pipe
   data.

At this point, there's a race: if the Creator thread gets past the
initialization part of the constructor, which sets __name(name), BEFORE
the original read_pipe thread gets to the part of cygthread::stub()
that sets info->__name = NULL, then you'll see the hang.  The new
read_pipe thread will give the "erroneous thread activation" message,
and the parent will be stuck waiting for data that will never arrive.

The only path that leaves an unused thread structure in a state where
h==NULL and ev is signalled is cygthread::release().  So the fix is
simple:

$ cat cygthread.cc.udiff
--- cygthread.cc.ORIG 2006-02-22 10:57:42.123931300 -0500
+++ cygthread.cc 2006-03-01 12:59:23.255023000 -0500
@@ -268,7 +268,12 @@
 cygthread::release (bool nuke_h)
 {
   if (nuke_h)
+{
     h = NULL;
+
+if (ev)
+  ResetEvent (ev);
+}
 #ifdef DEBUGGING
   __oldname = __name;
   debug_printf ("released thread '%s'", __oldname);
RE: Shells hang during script execution
Here's a description of a second hang condition we were encountering,
along with a patch for it.

The application (pdksh in this case) does a read on a pipe, which
eventually calls pipe.cc fhandler_pipe::read in Thread 1.  This creates a
new cygthread with "read_pipe()" as the function.  Then it calls
th->detach(read_state).

When the hang occurs, the new thread gets terminated early, before
cygthread::stub() can call "callfunc()".  You see the error message
"erroneous thread activation".  I'm not sure what's causing the thread
to fail activation, but the result is, the read_state semaphore never
gets signalled.

So Thread 1 goes into cygthread::detach(read_state).  The first thing
that happens is signal_arrived is set.  The old code would then set n=1,
but leave howlong=INFINITE.  My change sets howlong=100 in this case.
Then, when TIMEOUT occurs, we look to see if __name is not NULL.  Since
the thread was terminated, its name is now NULL, so it doesn't decrement
i, and eventually you break out of the loop and clean up as expected.

--- cygthread.cc.ORIG 2006-02-22 10:57:42.123931300 -0500
+++ cygthread.cc 2006-02-23 15:50:23.894461500 -0500
@@ -374,10 +374,12 @@
           break;
         case WAIT_OBJECT_0 + 1:
           n = 1;
-          if (i--)
-            howlong = 50;
+          i--;
+          howlong = 100;
           break;
         case WAIT_TIMEOUT:
+          if(!i && __name)
+            i--;
           break;
         default:
           if (!exiting)

> -----Original Message-----
> From: Ernie Coskrey
> Sent: Friday, February 10, 2006 1:31 PM
> To: Ernie Coskrey; 'cygwin@cygwin.com'
> Subject: RE: Shells hang during script execution
>
> We've been able to narrow this down some more.  The shell
> gets hung in sigsuspend(), waiting for SIGCHLD.  We've
> verified that the process that's executed as part of the
> command substitution does complete, and returns EOF, and the
> shell (we're testing with pdksh) goes into sigsuspend and
> never comes out.
>
> If we execute "kill -CHLD " on the hung shell, the shell resumes
> its processing.
>
> I'm going to continue to look into this - if anybody has any
> insight into how SIGCHLD might be getting lost, please let me
> know.  Thanks!
>
> Ernie Coskrey
>
> -----Original Message-----
> From: Ernie Coskrey
> Sent: Wed 2/1/2006 3:27 PM
> To: 'cygwin@cygwin.com'
> Subject: Shells hang during script execution
>
> I've run into problems with shell scripts hanging during
> execution for no apparent reason.  I've narrowed down my test
> case to two simple shell scripts.  To reproduce the problem,
> I ran three instances of the "top.sh" script included here,
> and after a bit (30 minutes to an hour or so) I'll see that
> one or two of the shells have just stopped in their tracks.
>
> Here are the scripts:
>
> top.sh:
>
> dir=$1
> loops=$2
>
> for loop in `seq 1 $loops`
> do
>   x=`./subtest.sh $dir`
>   date
>   echo loop $loop
> done
>
> subtest.sh:
>
> for j in `ls $1`
> do
>   if [ `echo $j | egrep -i "A|B" | wc -l` -ne 0 ]
>   then
>     echo $j
>   fi
> done
> echo subtest1 done >&2
>
> I then ran three bash shells.  The commands I ran,
> simultaneously, were:
>
> 1) ./top.sh C:/ 600
> 2) ./top.sh C:/windows 300
> 3) ./top.sh C:/windows/system32 100
>
> These ran for about 45 minutes, and then I noticed that two
> of them (1 and 2 above) had stopped printing any output.  The
> third was still moving along.  The third completed, but the
> first two never progressed any further.  I used Process
> Explorer from ntinternals.com, and saw that the two hung
> shells were not using any CPU, and did not have any child
> processes created; they were simply stopped.  If a process
> dump would be helpful, I can generate one with Windbg or gdb.
>
> -
> Ernie Coskrey
> SteelEye Technology, Inc.
> 803-461-3875
RE: Shells hang during script execution
There are two hang conditions that we've identified and have developed
fixes for.  This is a description of the first of the two along with a
patch; I'll follow up with a description and patch for the second.

If a signal can't be handled because it is blocked, it gets queued (on
the process's "sigq") to be handled later.  Now, whenever the process's
signal mask changes (e.g., the signal in question gets unblocked), an
attempt is made to handle all the queued signals (i.e., a signal flush
occurs).  However, if the queueing of the blocked signal happens right
after the signal mask change, then we miss the signal.  This causes the
process to hang.  The signal is on the queue, but the process doesn't
know to check for it.  The process just hangs until another signal gets
sent to it.

The workaround is basically to force the signal queue to be rescanned
(flushed) whenever we add something to it, so a queued signal is never
missed.

--- sigproc.cc.ORIG 2006-02-16 14:02:42.81432 -0500
+++ sigproc.cc 2006-02-22 10:55:20.327209900 -0500
@@ -1130,6 +1130,7 @@
         case __SIGNOHOLD:
         case __SIGFLUSH:
         case __SIGFLUSHFAST:
+flush:
           sigq.reset ();
           while ((q = sigq.next ()))
             {
@@ -1150,6 +1151,8 @@
           else
             {
               int sig = pack.si.si_signo;
+              if (sig == SIGCHLD)
+                clearwait = true;
               // FIXME: REALLY not right when taking threads into consideration.
               // We need a per-thread queue since each thread can have its own
               // list of blocked signals.  CGF 2005-08-24
@@ -1165,10 +1168,11 @@
                   system_printf ("Failed to arm signal %d from pid %d", pack.sig, pack.pid);
 #endif
                   sigq.add (pack); // FIXME: Shouldn't add this in !sh condition
+                  goto flush;      // signal may have become unblocked while
+                                   // we were processing it (before we added
+                                   // it to the sigq) -- flush sigq to be sure
                 }
             }
-          if (sig == SIGCHLD)
-            clearwait = true;
         }
       break;
     }

> -----Original Message-----
> From: Ernie Coskrey
> Sent: Friday, February 10, 2006 1:31 PM
> To: Ernie Coskrey; 'cygwin@cygwin.com'
> Subject: RE: Shells hang during script execution
>
> We've been able to narrow this down some more.  The shell
> gets hung in sigsuspend(), waiting for SIGCHLD.  We've
> verified that the process that's executed as part of the
> command substitution does complete, and returns EOF, and the
> shell (we're testing with pdksh) goes into sigsuspend and
> never comes out.
>
> If we execute "kill -CHLD " on the hung shell, the shell resumes
> its processing.
>
> I'm going to continue to look into this - if anybody has any
> insight into how SIGCHLD might be getting lost, please let me
> know.  Thanks!
>
> Ernie Coskrey
>
> -----Original Message-----
> From: Ernie Coskrey
> Sent: Wed 2/1/2006 3:27 PM
> To: 'cygwin@cygwin.com'
> Subject: Shells hang during script execution
>
> I've run into problems with shell scripts hanging during
> execution for no apparent reason.  I've narrowed down my test
> case to two simple shell scripts.  To reproduce the problem,
> I ran three instances of the "top.sh" script included here,
> and after a bit (30 minutes to an hour or so) I'll see that
> one or two of the shells have just stopped in their tracks.
>
> Here are the scripts:
>
> top.sh:
>
> dir=$1
> loops=$2
>
> for loop in `seq 1 $loops`
> do
>   x=`./subtest.sh $dir`
>   date
>   echo loop $loop
> done
>
> subtest.sh:
>
> for j in `ls $1`
> do
>   if [ `echo $j | egrep -i "A|B" | wc -l` -ne 0 ]
>   then
>     echo $j
>   fi
> done
> echo subtest1 done >&2
>
> I then ran three bash shells.  The commands I ran,
> simultaneously, were:
>
> 1) ./top.sh C:/ 600
> 2) ./top.sh C:/windows 300
> 3) ./top.sh C:/windows/system32 100
>
> These ran for about 45 minutes, and then I noticed that two
> of them (1 and 2 above) had stopped printing any output.  The
> third was still moving along.  The third completed, but the
> first two never progressed any further.  I used Process
> Explorer from ntinternals.com, and saw that the two hung
> shells were not using any CPU, and did not have any child
> processes created; they were simply stopped.  If a process
> dump would be helpful, I can generate one with Windbg or gdb.
>
> -
> Ernie Coskrey
> SteelEye Technology, Inc.
> 803-461-3875
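The lost-signal window described in this message (a blocked signal is queued just after the mask change has already flushed the queue) can be replayed deterministically with a small model. This is a sketch under stated assumptions, not Cygwin's sigproc.cc: ModelSigQueue, stranded_after_race and the signal number 20 are hypothetical, and the three steps of the race are run in a fixed order on one thread rather than raced for real.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <set>
#include <vector>

// Hypothetical model of a per-process queue of blocked signals.
struct ModelSigQueue {
    std::set<int> blocked;       // signal numbers currently blocked
    std::deque<int> pending;     // signals queued because they were blocked
    std::vector<int> delivered;  // signals that have been handled
    bool flush_after_add;        // models the "goto flush" workaround

    explicit ModelSigQueue(bool flush_on_add) : flush_after_add(flush_on_add) {}

    // Deliver every queued signal that is no longer blocked.
    void flush() {
        std::deque<int> still_blocked;
        for (int sig : pending) {
            if (blocked.count(sig))
                still_blocked.push_back(sig);
            else
                delivered.push_back(sig);
        }
        pending.swap(still_blocked);
    }

    // A mask change triggers a flush of the queue.
    void unblock(int sig) {
        blocked.erase(sig);
        flush();
    }

    bool is_blocked(int sig) const { return blocked.count(sig) != 0; }

    // Queue a signal that was judged to be blocked earlier.
    void enqueue(int sig) {
        pending.push_back(sig);
        if (flush_after_add)
            flush();
    }
};

// Replays the race and returns how many signals are left stranded.
std::size_t stranded_after_race(bool flush_after_add) {
    const int kModelSigChld = 20;  // arbitrary number for illustration
    ModelSigQueue q(flush_after_add);
    q.blocked.insert(kModelSigChld);

    const bool was_blocked = q.is_blocked(kModelSigChld);  // 1: seen as blocked
    assert(was_blocked);
    q.unblock(kModelSigChld);      // 2: mask change flushes an empty queue
    if (was_blocked)
        q.enqueue(kModelSigChld);  // 3: signal is queued too late
    return q.pending.size();
}
```

In this model stranded_after_race(false) leaves one signal sitting on the queue with nothing scheduled to look at it, while stranded_after_race(true) leaves none, because the queue is rescanned right after the add.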
RE: Shells hang during script execution
We've been able to narrow this down some more.  The shell gets hung in
sigsuspend(), waiting for SIGCHLD.  We've verified that the process
that's executed as part of the command substitution does complete, and
returns EOF, and the shell (we're testing with pdksh) goes into
sigsuspend and never comes out.

If we execute "kill -CHLD " on the hung shell, the shell resumes its
processing.

I'm going to continue to look into this - if anybody has any insight
into how SIGCHLD might be getting lost, please let me know.  Thanks!

Ernie Coskrey

-----Original Message-----
From: Ernie Coskrey
Sent: Wed 2/1/2006 3:27 PM
To: 'cygwin@cygwin.com'
Subject: Shells hang during script execution

I've run into problems with shell scripts hanging during execution for
no apparent reason.  I've narrowed down my test case to two simple shell
scripts.  To reproduce the problem, I ran three instances of the
"top.sh" script included here, and after a bit (30 minutes to an hour or
so) I'll see that one or two of the shells have just stopped in their
tracks.

Here are the scripts:

top.sh:

dir=$1
loops=$2

for loop in `seq 1 $loops`
do
  x=`./subtest.sh $dir`
  date
  echo loop $loop
done

subtest.sh:

for j in `ls $1`
do
  if [ `echo $j | egrep -i "A|B" | wc -l` -ne 0 ]
  then
    echo $j
  fi
done
echo subtest1 done >&2

I then ran three bash shells.  The commands I ran, simultaneously, were:

1) ./top.sh C:/ 600
2) ./top.sh C:/windows 300
3) ./top.sh C:/windows/system32 100

These ran for about 45 minutes, and then I noticed that two of them (1
and 2 above) had stopped printing any output.  The third was still
moving along.  The third completed, but the first two never progressed
any further.  I used Process Explorer from ntinternals.com, and saw that
the two hung shells were not using any CPU, and did not have any child
processes created; they were simply stopped.  If a process dump would be
helpful, I can generate one with Windbg or gdb.

Here's my cygcheck output:

Cygwin Configuration Diagnostics
Current System Time: Wed Feb 01 15:07:43 2006

Windows 2003 Server Ver 5.2 Build 3790 Service Pack 1

Path:   C:\WINDOWS\system32
        C:\WINDOWS
        C:\WINDOWS\System32\Wbem
        C:\Program Files\Microsoft SQL Server\80\Tools\BINN
        C:\Program Files\SUperior SU

Output from C:\cygwin\bin\id.exe (nontsec)
UID: 500(Administrator) GID: 513(None)
0(root) 513(None) 544(Administrators) 545(Users)

Output from C:\cygwin\bin\id.exe (ntsec)
UID: 500(Administrator) GID: 513(None)
0(root) 513(None) 544(Administrators) 545(Users)

SysDir: C:\WINDOWS\system32
WinDir: C:\Documents and Settings\Administrator\WINDOWS

Here's some environment variables that may affect cygwin:
PWD = '/usr/bin'
HOME = '/home/Administrator'

Here's the rest of your environment variables:
HOMEPATH = '\Documents and Settings\Administrator'
APPDATA = 'C:\Documents and Settings\Administrator\Application Data'
TERM = 'cygwin'
PROCESSOR_IDENTIFIER = 'x86 Family 15 Model 2 Stepping 7, GenuineIntel'
WINDIR = 'C:\WINDOWS'
TMPDIR = '/cygdrive/c/Documents and Settings/Administrator/Local Settings/Temp'
USERDOMAIN = 'EAGLE'
OS = 'Windows_NT'
ALLUSERSPROFILE = 'C:\Documents and Settings\All Users'
TEMP = '/cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp'
COMMONPROGRAMFILES = 'C:\Program Files\Common Files'
USERNAME = 'Administrator'
CLUSTERLOG = 'C:\WINDOWS\Cluster\cluster.log'
PROCESSOR_LEVEL = '15'
FP_NO_HOST_CHECK = 'NO'
SYSTEMDRIVE = 'C:'
USERPROFILE = 'C:\Documents and Settings\Administrator'
LOGONSERVER = '\\EAGLE'
PROCESSOR_ARCHITECTURE = 'x86'
!C: = 'C:\cygwin\bin'
EXTMIRRBASE = 'C:\LKDR'
SHLVL = '1'
PATHEXT = '.COM;.EXE;.BAT;.CMD;.VBS;.VBE;.JS;.JSE;.WSF;.WSH'
HOMEDRIVE = 'C:'
PROMPT = '$P$G'
COMSPEC = 'C:\WINDOWS\system32\cmd.exe'
TMP = '/cygdrive/c/DOCUME~1/ADMINI~1/LOCALS~1/Temp'
SYSTEMROOT = 'C:\WINDOWS'
PROCESSOR_REVISION = '0207'
PROGRAMFILES = 'C:\Program Files'
NUMBER_OF_PROCESSORS = '2'
SESSIONNAME = 'Console'
COMPUTERNAME = 'EAGLE'
!EXITCODE = '0001'
_ = './cygcheck'
POSIXLY_CORRECT = '1'

Scanning registry for keys with 'Cygnus' in them...
HKEY_CURRENT_USER\Software\Cygnus Solutions
HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin
HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin\mounts v2
HKEY_CURRENT_USER\Software\Cygnus Solutions\Cygwin\Program Options
HKEY_CURRENT_USER\Software\SteelEye\LifeKeeper\Cygnus Solutions
HKEY_CURRENT_USER\Softwa
Shells hang during script execution
win Package Information
Last downloaded files to: C:\cygwinpkg
Last downloaded files from: ftp://ftp.cise.ufl.edu/pub/mirrors/cygwin

Package              Version
_update-info-dir     00352-1
alternatives         1.3.20a-2
ash                  20040127-3
base-files           3.7-1
base-passwd          2.2-1
bash                 3.0-14
bzip2                1.0.3-1
coreutils            5.93-3
crypt                1.1-1
cygutils             1.2.9-1
cygwin               1.5.19-4
cygwin-doc           1.4-3
diffutils            2.8.7-1
editrights           1.01-1
findutils            4.2.27-1
gawk                 3.1.5-2
gdb                  20041228-3
gdbm                 1.8.3-7
grep                 2.5.1a-2
groff                1.18.1-2
gzip                 1.3.5-1
less                 381-1
libbz2_1             1.0.3-1
libcharset1          1.9.2-2
libgdbm              1.8.0-5
libgdbm-devel        1.8.3-7
libgdbm3             1.8.3-3
libgdbm4             1.8.3-7
libiconv             1.9.2-2
libiconv2            1.9.2-2
libintl              0.10.38-3
libintl1             0.10.40-1
libintl2             0.12.1-3
libintl3             0.14.5-1
libncurses5          5.2-1
libncurses6          5.2-8
libncurses7          5.3-4
libncurses8          5.4-4
libpcre0             6.3-1
libpopt0             1.6.4-4
libreadline4         4.1-2
libreadline5         4.3-5
libreadline6         5.1-2
login                1.9-7
man                  1.5p-1
mktemp               1.5-3
ncurses              5.4-4
pdksh                5.2.14-3
run                  1.1.6-1
sed                  4.1.4-1
tar                  1.15.1-3
tcltk                20030901-1
termcap              20050421-1
terminfo             5.4_20041009-1
texinfo              4.8-1
vim                  6.4-4
which                1.7-1
zlib                 1.2.3-1

Thanks for any help you can provide on this!

-
Ernie Coskrey
SteelEye Technology, Inc.
803-461-3875