Re: hang when using pthread and fork in 1.5.23-1 and snapshot 20070118, and now 1.5.24-1.
On Jan 31, 2007, at 6:46 AM, Brian Ford wrote:

> On Fri, 26 Jan 2007, Peter Rehley wrote:
>
>> Hello,
>>
>> I tried the latest release of cygwin1.dll (1.5.24-1) and it still is
>> hanging in the same way. I've tried to debug further with gdb, but
>> so far I haven't got any useful information out of gdb.
>>
>> I'll keep trying to get some debug information, but if any one else
>> can reproduce the problem I would be most appreciative.
>
> I can reproduce a problem. Your descriptions of it are a bit hard to
> follow, so I'm not sure if it is your problem or not. Unfortunately, I
> don't have time to debug it right now. I do have a few comments, though.

Hmmm, rereading those descriptions I see what you mean. I'll try to
clarify.

1) happens when the pthread_create fails. Resources are used up,
   basically. It's a normal error condition.

2) happens when the fork doesn't return. The last message that is seen
   is "forking". No messages following it are seen, and no messages
   from the main program are seen.

3) happens when the fork returns but has failed. The last message that
   is seen is "done here" after the "Unable to fork". I've tracked what
   happens after the "done here" message and the thread is exiting. So
   it would seem that the hang is in the main program.

> Why are you creating a thread just to fork/exec another process?

Our main application handles requests from a named socket. Some of the
requests call shell scripts. Most of these shell scripts can send more
requests to the application (I didn't write this, I just have to
maintain it). So for those requests that call shell scripts the
application has to create a thread and within the thread fork and then
exec.

> Pedantically, I believe you are supposed to call _exit, not exit, if
> fork fails as stated here in the Solaris man page for fork:
>
>     An applications should call _exit() rather than exit(3C) if it
>     cannot execve(), since exit() will flush and close standard I/O
>     channels and thereby corrupt the parent process's standard I/O
>     data structures. Using exit(3C) will flush buffered data twice.
>     See exit(2).

This is good to know because the same application also runs on Solaris.
Although, it seems to run fine there.

> I don't know, however, if this is really true in Cygwin, but it might
> explain some misdiagnosed hangs on your part.
>
> Also, the execve call appears to be suspect. Again, the Solaris man
> page for execve states:
>
>     The value in argv[0] should point to a filename that is
>     associated with the process being started by one of the exec
>     functions.
>     [snip]
>     As indicated, argc is at least one and the first member of the
>     array points to a string containing the name of the file.
>
> Attached is a modified test case that fixes a few of these issues, but
> still hangs (or stutters; it does appear to proceed after long periods
> of time).

I've modified my test case to make sure that execve has valid
arguments, but I still get the hang. FWIW, execve is being used because
of the shell scripts being called.

Peter

--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Problem reports: http://cygwin.com/problems.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/
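The two points discussed in this message (an argument list whose first entry names the program and which ends in NULL, and _exit() rather than exit() when the exec fails) can be sketched roughly as follows. This is only an illustration, not the application's code: the name run_shell_command is hypothetical, and running the command through /bin/sh -c is an assumption made to keep the example short. It also says nothing about whether the Cygwin hang itself goes away.

```cpp
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

// Hypothetical helper: run `cmd` through /bin/sh and return its exit
// code, or -1 if fork/waitpid failed or the child died from a signal.
int run_shell_command(const char *cmd)
{
    char sh[] = "/bin/sh";
    char dash_c[] = "-c";
    // argv[0] names the program being started; the list ends with NULL.
    char *const argv[] = { sh, dash_c, const_cast<char *>(cmd), NULL };

    pid_t pid = fork();
    if (pid < 0)
        return -1;          // fork failed: we are still in the parent,
                            // so report the error instead of exiting
    if (pid == 0) {
        execve(sh, argv, environ);
        _exit(127);         // exec failed: leave without flushing the
                            // stdio buffers copied from the parent
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;      // real waitpid error
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A thread function could call run_shell_command("exit 3") and would get 3 back. Note that the fork-failure branch returns to the caller: at that point no child exists, so calling _exit() there would end the whole application rather than a child.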
Re: hang when using pthread and fork in 1.5.23-1 and snapshot 20070118.
Peter Rehley wrote:
> Hello,
>
> One of the applications I've been working with has hanging issues. It
> will sometimes work properly, and sometimes it will hang and never
> continue through the rest of the program.

I have not done any pthread programming under cygwin, but I have done a
fair bit under Solaris.

In my experience threads and fork are not good bedfellows; you need to
exercise care in order to avoid deadlock.

From the Solaris fork() man page:

    fork() Safety
    If a multithreaded application calls fork() or fork1(), and the
    child does more than simply call one of the exec(2) functions,
    there is a possibility of deadlock occurring in the child. The
    application should use pthread_atfork(3C) to ensure safety with
    respect to this deadlock. Should there be any outstanding mutexes
    throughout the process, the application should call
    pthread_atfork() to wait for and acquire those mutexes prior to
    calling fork() or fork1(). See "MT-Level of Libraries" on the
    attributes(5) manual page.

Using stdio in the child after fork in multithreaded apps has caused me
pain on many occasions, as has std::string in C++.

A recommended way to deal with this that I have seen on the web is to
spawn a process before any threads to handle the forks, and use pipes
to communicate between the threads and the forking process.

--
Al Slater

Technical Director
Stanton Consultancy Ltd
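As a rough illustration of the pthread_atfork() advice in the man page quoted above, the sketch below takes one application mutex before every fork() and releases it again in both parent and child, so the child never inherits the mutex in a locked state. All names (g_state_lock, install_fork_handlers, fork_and_use_lock) are hypothetical, and a real application would need a handler for each lock it owns; locks inside libraries such as stdio are outside its reach.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical application lock that worker threads may hold at the
// moment some other thread calls fork().
static pthread_mutex_t g_state_lock = PTHREAD_MUTEX_INITIALIZER;

// prepare handler: runs in the forking thread before fork(), and
// blocks until no other thread holds the lock.
static void before_fork()       { pthread_mutex_lock(&g_state_lock); }
// parent handler: runs in the parent after fork() returns.
static void after_fork_parent() { pthread_mutex_unlock(&g_state_lock); }
// child handler: runs in the child's only thread after fork().
static void after_fork_child()  { pthread_mutex_unlock(&g_state_lock); }

// Register the handlers once, before any thread can call fork().
// Returns 0 on success, an error number otherwise.
int install_fork_handlers()
{
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

// Fork a child that tries to take the lock. Returns the child's exit
// code (0 if the lock was free in the child), or -1 on failure.
int fork_and_use_lock()
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int rc = pthread_mutex_trylock(&g_state_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The single-threaded call sequence install_fork_handlers() followed by fork_and_use_lock() only shows the mechanics; the handlers matter when another thread may be inside a critical section at the time of the fork.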
Re: hang when using pthread and fork in 1.5.23-1 and snapshot 20070118, and now 1.5.24-1.
On Fri, 26 Jan 2007, Peter Rehley wrote:

> Hello,
>
> I tried the latest release of cygwin1.dll (1.5.24-1) and it still is
> hanging in the same way. I've tried to debug further with gdb, but
> so far I haven't got any useful information out of gdb.
>
> I'll keep trying to get some debug information, but if any one else
> can reproduce the problem I would be most appreciative.

I can reproduce a problem. Your descriptions of it are a bit hard to
follow, so I'm not sure if it is your problem or not. Unfortunately, I
don't have time to debug it right now. I do have a few comments, though.

Why are you creating a thread just to fork/exec another process?

Pedantically, I believe you are supposed to call _exit, not exit, if
fork fails as stated here in the Solaris man page for fork:

    An applications should call _exit() rather than exit(3C) if it
    cannot execve(), since exit() will flush and close standard I/O
    channels and thereby corrupt the parent process's standard I/O
    data structures. Using exit(3C) will flush buffered data twice.
    See exit(2).

I don't know, however, if this is really true in Cygwin, but it might
explain some misdiagnosed hangs on your part.

Also, the execve call appears to be suspect. Again, the Solaris man
page for execve states:

    The value in argv[0] should point to a filename that is
    associated with the process being started by one of the exec
    functions.
    [snip]
    As indicated, argc is at least one and the first member of the
    array points to a string containing the name of the file.

Attached is a modified test case that fixes a few of these issues, but
still hangs (or stutters; it does appear to proceed after long periods
of time).

--
Brian Ford
Lead Realtime Software Engineer
VITAL - Visual Simulation Systems
FlightSafety International
the best safety device in any aircraft is a well-trained crew...

/* main.cc
 *
 */
/* The header names were lost in the archived copy; these are the five
 * headers that the calls below require. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <pthread.h>

void usage()
{
  printf("Usage:\n");
  printf("-p - use program instead of /bin/ls \n");
  exit(1);
}

/* Thread function: fork, then exec the program named by data. */
void * forkit2me(void *data)
{
  pid_t pid;

  printf("forking\n");
  if ((pid = fork()) < 0 ) {
    printf("Unable to fork\n");
    _exit(1);
  } else if (pid == 0 ) {
    printf ("child: %s\n", (char *)data);
    /* argv[0] is the program name; the argument list ends with a
     * null pointer, followed by the (empty) environment pointer. */
    execle((char *)data, (char *)data, (char *)NULL, (char **)NULL);
    perror("exec failed");
    _exit(1);
  } else {
    printf ("parent\n");
  }
  printf ("done here\n");
  return data;
}

int main(int argc, char * argv[])
{
  int quit=0;
  static char * prog2run = "/bin/ls";

  if (argc > 0) {
    for (int i=1; i < argc; i++) {
      if ( argv[i][0]== '-' ) {
        switch (argv[i][1]) {
        case 'p':
          if ( i+1 < argc ) {
            prog2run=argv[i+1];
            i++;
          }
          break;
        default:
          usage();
        }
      }
    }
  }

  /* 1 is FD_CLOEXEC: close stdout and stderr in the exec'd program. */
  fcntl( fileno( stdout ), F_SETFD, 1 );
  fcntl( fileno( stderr ), F_SETFD, 1 );

  int rc;
  pthread_attr_t ta;
  pthread_t threadId;

  rc=pthread_attr_init(&ta);
  if (rc) {
    printf("pthread_attr_init failed: rc (%d)\n",rc);
    return 1;
  }
  rc=pthread_attr_setdetachstate(&ta,PTHREAD_CREATE_DETACHED);
  if (rc) {
    printf("pthread_attr_setdetachstate failed: rc (%d)\n",rc);
    return 1;
  }

  /* quit is never set: keep creating detached threads until
   * pthread_create fails. */
  while (!quit) {
    printf("here we go\n");
#ifdef PRFAIL
    printf("creating thread\n");
#endif
    rc=pthread_create(&threadId, &ta, forkit2me, prog2run);
    if (rc) {
      printf("pthread_create failed: rc (%d)\n",rc);
      break;
    }
#ifdef PRFAIL
    printf("created\n");
#endif
  }
  return 0;
}
Re: hang when using pthread and fork in 1.5.23-1 and snapshot 20070118, and now 1.5.24-1.
Hello,

I tried the latest release of cygwin1.dll (1.5.24-1) and it still is
hanging in the same way. I've tried to debug further with gdb, but
so far I haven't got any useful information out of gdb.

I'll keep trying to get some debug information, but if any one else
can reproduce the problem I would be most appreciative.

Thanks,
Peter

p.s. my machines' specs are
windows xp, sp2, 2.93 GHz, 760MB ram.
windows xp, sp1, 2.39 GHz, 508MB ram.
both are single processor units.

On Jan 19, 2007, at 6:03 PM, Peter Rehley wrote:

> Hello,
>
> One of the applications I've been working with has hanging issues. It
> will sometimes work properly, and sometimes it will hang and never
> continue through the rest of the program.
>
> I've created a simple test case that does some of what the
> application does, and it will hang too. The test case has a loop that
> continually creates a pthread. The pthread calls a function that
> forks and execve's to another program. Eventually the main program
> will be unable to fork, and it will hang inside of the pthread after
> the thread's function has completed.
>
> However, I can also get two other different results depending on how
> the program is compiled and run.
>
> 1) pthread_create failed : rc 11 - valid error.
>    build with "g++ -DPRFAIL main.cc" and run without redirecting
>    output. Adds additional printf statements to output
> 2) fork called but never returns. one hang situation.
>    build with g++ main.cc and run with redirecting output to a file.
> 3) Unable to create fork, but program doesn't appear to leave thread
>    and program hangs.
>    build with g++ main.cc and run without redirecting output.
>
> I suspect, maybe incorrectly, that the hangs are race conditions.
>
> I'm hoping that someone will be able to take the test case and be
> able to reproduce what I'm seeing.
>
> The machine is a fresh install of windows xp only. No webcam drivers
> or other known programs that interact badly with cygwin. I have AVG
> antivirus installed but even with it uninstalled the program can
> still hang.
>
> I've attached the cygcheck output and the simple test case.
>
> Thanks,
> Peter
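One way the test case described above could be restructured so that no thread ever calls fork() is the helper-process arrangement suggested in Al Slater's reply: start one child before any thread exists, and let threads send it commands over a pipe. The sketch below is only that, a sketch. Every name in it is hypothetical, the fixed 256-byte request and the use of /bin/sh -c are assumptions made for brevity, and it has not been tried on Cygwin.

```cpp
#include <errno.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { CMD_MAX = 256 };          // fixed request size (arbitrary choice)

static int g_to_helper = -1;     // parent writes requests here
static int g_from_helper = -1;   // parent reads exit codes here
static pthread_mutex_t g_helper_lock = PTHREAD_MUTEX_INITIALIZER;

// Read exactly n bytes; false on EOF or error.
static bool read_all(int fd, void *buf, size_t n)
{
    char *p = static_cast<char *>(buf);
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r < 0 && errno == EINTR) continue;
        if (r <= 0) return false;
        p += r;
        n -= static_cast<size_t>(r);
    }
    return true;
}

// Write exactly n bytes; false on error.
static bool write_all(int fd, const void *buf, size_t n)
{
    const char *p = static_cast<const char *>(buf);
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w < 0 && errno == EINTR) continue;
        if (w <= 0) return false;
        p += w;
        n -= static_cast<size_t>(w);
    }
    return true;
}

// Runs in the single-threaded helper until the request pipe closes.
static void helper_loop(int in, int out)
{
    char cmd[CMD_MAX];
    while (read_all(in, cmd, sizeof cmd)) {
        cmd[CMD_MAX - 1] = '\0';
        int code = -1;
        int status = 0;
        pid_t pid = fork();
        if (pid == 0) {
            close(in);
            close(out);
            execl("/bin/sh", "/bin/sh", "-c", cmd, (char *)NULL);
            _exit(127);          // exec failed
        }
        if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
            code = WEXITSTATUS(status);
        if (!write_all(out, &code, sizeof code)) break;
    }
    _exit(0);
}

// Call once from main() before the first pthread_create().
// Returns 0 on success, -1 on failure.
int start_fork_helper()
{
    int req[2], rep[2];
    if (pipe(req) < 0) return -1;
    if (pipe(rep) < 0) { close(req[0]); close(req[1]); return -1; }
    pid_t pid = fork();
    if (pid < 0) {
        close(req[0]); close(req[1]); close(rep[0]); close(rep[1]);
        return -1;
    }
    if (pid == 0) {
        close(req[1]);
        close(rep[0]);
        helper_loop(req[0], rep[1]);   // does not return
    }
    close(req[0]);
    close(rep[1]);
    g_to_helper = req[1];
    g_from_helper = rep[0];
    return 0;
}

// Callable from any thread. Returns the command's exit code, or -1.
int run_via_helper(const char *cmd)
{
    char buf[CMD_MAX];
    if (strlen(cmd) >= sizeof buf) return -1;
    memset(buf, 0, sizeof buf);
    memcpy(buf, cmd, strlen(cmd));
    int code = -1;
    pthread_mutex_lock(&g_helper_lock);
    bool ok = write_all(g_to_helper, buf, sizeof buf) &&
              read_all(g_from_helper, &code, sizeof code);
    pthread_mutex_unlock(&g_helper_lock);
    return ok ? code : -1;
}
```

After start_fork_helper() succeeds, a thread would call something like run_via_helper("exit 7") and get 7 back. The mutex means only one command runs at a time, which is a deliberate simplification: an application whose scripts send further requests that start more scripts, as described in this thread, would need a helper that does not wait for each command before accepting the next one. A write to a helper that has died raises SIGPIPE unless the application ignores that signal.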