Re: SIGCHLD not received
On Tue, May 29, 2012 at 10:57:31PM +0400, Denis Bilenko wrote:
> No, it's not that. Attached is the program that checks the error codes
> and also waits for a child using child watcher. Still fails.

You have now changed your program considerably, and I am too lazy/sleepy to track down further bugs in it.

If I modify your original program in the way I said, to close the extra fd, call waitpid on the child and destroy the default loop before the next fork, it runs fine here.

This is the end result, in case my description of my changes wasn't clear:

http://data.plan9.de/testsigchld.c

You could use that as a starting point, although my recommendations from my first reply are still something you should consider seriously :)

--
The choice of a Deliantra, the free code+content MORPG
-==- _GNU_ http://www.deliantra.net
==-- _ generation
---==---(_)__ __ __ Marc Lehmann
--==---/ / _ \/ // /\ \/ / schm...@schmorp.de
-=/_/_//_/\_,_/ /_/\_\

___
libev mailing list
libev@lists.schmorp.de
http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev
Re: SIGCHLD not received
On Tue, May 29, 2012 at 10:08 PM, Marc Lehmann wrote:
> You also don't call waitpid on your children - can you add error checking
> to all functions you call, to make sure you really do wait for your
> children (especially check fork returns)? Most likely you run out of
> process slots and fork fails, which your code "misinterprets" as a hang.

No, it's not that. Attached is the program that checks the error codes and also waits for a child using child watcher. Still fails.

/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld
 *
 * Expected output: infinite sequence of "*."
 *
 * Actual output:
 * denis@denis-laptop:~/work/libev-cvs$ ./testsigchld
 * *.*.*.Alarm clock
 *
 * (number of iterations is different each time)
 *
 */

#include "ev.c"

void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

int CHECK(int retcode) {
    if (retcode < 0) {
        perror("fail");
        _exit(1);
    }
    return retcode;
}

void subprocess(void) {
    int pid;
    if (pid = CHECK(fork())) {
        struct ev_child child;
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(EV_DEFAULT, &child);
        ev_run(EV_DEFAULT, 0);
        CHECK(fprintf(stderr, "*"));
    } else {
        _exit(0);
    }
}

void test_main(void) {
    int pid = CHECK(fork());
    if (!pid) {
        ev_loop_fork(EV_DEFAULT);
        subprocess();
        _exit(0);
    }
    alarm(5);
    struct ev_child child;
    ev_child_init(&child, stop_child, pid, 0);
    ev_child_start(EV_DEFAULT, &child);
    ev_run(EV_DEFAULT, 0);
    alarm(0);
}

int main(int argc, char** argv) {
    ev_default_loop(EVBACKEND_SELECT);
    while (1) {
        test_main();
        CHECK(fprintf(stderr, "."));
    }
}
Re: SIGCHLD not received
On Tue, May 29, 2012 at 09:57:30PM +0400, Denis Bilenko wrote:
> On Tue, May 29, 2012 at 7:12 PM, Marc Lehmann wrote:
> > This one - your test program forks after initialising the default loop,
> > without calling ev_default_fork.
>
> OK, I've fixed the test program to do that and also fixed a fd leak.
> It takes a bit longer to fail now, but it still fails.

You also don't call waitpid on your children - can you add error checking to all functions you call, to make sure you really do wait for your children (especially check fork returns)? Most likely you run out of process slots and fork fails, which your code "misinterprets" as a hang.

> > Note that this only works because the fork isn't done while the default
> > loop exists at the time - if you would fork while the default loop
> > existed, you'd have to work with ev_default_fork and probably stop all
> > watchers you inherited from the parent.
>
> At the time of fork there are no active watchers in the test program.

Well, there are, inside libev. But that's pretty irrelevant: the documentation doesn't say you can ignore the fork if you don't have any active watchers.

> To rule out epoll issues, I'm now using EVBACKEND_SELECT explicitly -
> still fails. There are no pthreads either, so it has to be something else.

You still have to call ev_default_fork before you can reuse a loop in the child; there are no exceptions.

--
Marc Lehmann
schm...@schmorp.de
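For readers following along, the error checking asked for above (check what fork returns, and really wait for the child) might look roughly like the following when written with plain POSIX calls. This is only a sketch and not code from the thread: the helper name spawn_and_reap is hypothetical, and it deliberately leaves libev out of the picture.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of the test program or of libev):
 * fork a child that exits with `code`, reap it with waitpid, and
 * return its exit status. Returns -1 if fork or waitpid fails, or
 * if the child did not exit normally. */
int spawn_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0) {
        /* A failed fork (for example EAGAIN when the process limit is
         * reached) must not be mistaken for "the child never reported". */
        perror("fork");
        return -1;
    }
    if (pid == 0)
        _exit(code);                     /* child: exit immediately */

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);    /* reap, so no zombie is left behind */
    } while (r < 0 && errno == EINTR);   /* retry if interrupted by a signal */

    if (r < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling spawn_and_reap(0) in a loop should keep the number of live processes constant, because every child is reaped before the next fork.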
Re: SIGCHLD not received
On Tue, May 29, 2012 at 7:12 PM, Marc Lehmann wrote:
> This one - your test program forks after initialising the default loop,
> without calling ev_default_fork.

OK, I've fixed the test program to do that and also fixed a fd leak. It takes a bit longer to fail now, but it still fails.

> I added an ev_loop_destroy (EV_DEFAULT) at the end of test_main, and got a
> lot longer output, until:
>
> (libev) error creating signal/async pipe: Too many open files

With the fd leak fixed, this should no longer happen. I've tried adding ev_loop_destroy too - it still fails.

> Note that this only works because the fork isn't done while the default
> loop exists at the time - if you would fork while the default loop
> existed, you'd have to work with ev_default_fork and probably stop all
> watchers you inherited from the parent.

At the time of fork there are no active watchers in the test program.

> Now, the fork business is very unfortunate, but both epoll/kqueue and
> pthreads have diminished fork into a state where using an event loop in
> both parent and child has become extremely hard (actually, doing anything
> in the child is hard with pthreads).

To rule out epoll issues, I'm now using EVBACKEND_SELECT explicitly - still fails. There are no pthreads either, so it has to be something else.

Attached is a modified version with all the fixes mentioned above. Please try it as well :)

I've also tried using EVFLAG_FORKCHECK but that did not help either.

/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld
 *
 * Expected output: infinite sequence of *.*.*...
 *
 * Actual output:
 * denis@denis-laptop:~/work/libev-cvs$ ./testsigchld
 * *.*.*.Alarm clock
 *
 * (number of iterations is different each time)
 *
 */

#include "ev.c"

struct ev_child child;

void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

void stop_io(struct ev_loop* loop, ev_io *w, int revents) {
    ev_io_stop(loop, w);
}

void subprocess(void) {
    int pid;
    if (pid = fork()) {
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(EV_DEFAULT, &child);
        ev_run(EV_DEFAULT, 0);
        fprintf(stderr, "*");
    } else {
        _exit(0);
    }
}

void test_main(void) {
    int pipefd[2];
    while (pipe(pipefd))
        perror("pipe");
    int pid = fork();
    if (!pid) {
        ev_loop_fork(EV_DEFAULT);
        close(pipefd[0]);
        subprocess();
        write(pipefd[1], "k", 1);
        _exit(0);
    }
    close(pipefd[1]);
    alarm(5);
    struct ev_io io;
    ev_io_init(&io, stop_io, pipefd[0], 1);
    ev_io_start(EV_DEFAULT, &io);
    ev_run(EV_DEFAULT, 0);
    close(pipefd[0]);
    alarm(0);
}

int main(int argc, char** argv) {
    ev_default_loop(EVBACKEND_SELECT);
    while (1) {
        test_main();
        fprintf(stderr, ".");
    }
}
Re: SIGCHLD not received
On Tue, May 29, 2012 at 05:29:20PM +0200, Gabriel Kerneis wrote:
> I do not understand what you mean here, probably because I never tried to mix
> fork, pthreads and epoll in a single program. Could you please provide some

Oh, the problems are quite independent of each other, too, i.e. you don't have to use them together to run into them.

--
Marc Lehmann
schm...@schmorp.de
Re: SIGCHLD not received
On Tue, May 29, 2012 at 05:29:20PM +0200, Gabriel Kerneis wrote:
> I do not understand what you mean here, probably because I never tried to mix
> fork, pthreads and epoll in a single program. Could you please provide some
> more details,

epoll: epoll file descriptors are available in the child, and events are delivered to both processes, even for file descriptors that are closed in one of them. since you need the fd itself to remove it from an epoll set, you can't remove an fd that you have already closed (libev uses a heuristic to detect this case and then recreates the whole epoll set, which can be slow).

kqueue: the kqueue fd is close'd during fork (yes, missing :), or disabled (yes, fd is still there), and event libraries cannot really distinguish between the cases. this is not as bad as epoll, but still requires special fork support from event libraries.

pthreads: after pthread_create, the child environment after fork is as restricted as in a signal handler, i.e. after fork, you cannot use malloc/printf or anything else that's unsafe in a signal handler.

the result of this is that it's no fun at all to use fork for multiprocessing. (not that fork ever was so good for multiprocessing when doing something more complicated than a preforked server).

--
Marc Lehmann
schm...@schmorp.de
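The epoll point above can be reproduced on Linux without forking at all. The sketch below is not from the thread, and the function name epoll_stale_fd_demo is hypothetical. It assumes that a dup()ed descriptor is a fair stand-in for the copy another process would hold after fork, since both keep the underlying open file description alive. It is Linux-only.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical Linux-only demo. Returns 1 if epoll still reports an
 * event for a descriptor number this process has already closed AND
 * that number can no longer be removed from the set (EBADF);
 * returns 0 if that is not observed, -1 on a setup error. */
int epoll_stale_fd_demo(void)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    int ep = epoll_create1(0);
    int copy = dup(p[0]);   /* stands in for the copy a forked process holds */
    int result = -1;

    struct epoll_event ev = { .events = EPOLLIN, .data = { .fd = p[0] } };
    if (ep >= 0 && copy >= 0 &&
        epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        int closed_fd = p[0];
        close(p[0]);        /* "this" process gives up its descriptor */
        p[0] = -1;

        /* The registration follows the open file description, which is
         * still alive through `copy`, so the event is still delivered. */
        struct epoll_event out;
        int n = epoll_wait(ep, &out, 1, 1000);
        int still_reported = (n == 1 && out.data.fd == closed_fd);

        /* Removing it needs a valid descriptor, which we no longer have. */
        int rc = epoll_ctl(ep, EPOLL_CTL_DEL, closed_fd, NULL);
        int cannot_remove = (rc < 0 && errno == EBADF);

        result = (still_reported && cannot_remove) ? 1 : 0;
    }

    if (p[0] >= 0) close(p[0]);
    close(p[1]);
    if (copy >= 0) close(copy);
    if (ep >= 0) close(ep);
    return result;
}
```

On a Linux kernel this is expected to return 1, which is the situation an event library ends up in once parent and child share an epoll set.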
Re: SIGCHLD not received
Hi Marc,

On Tue, May 29, 2012 at 05:12:22PM +0200, Marc Lehmann wrote:
> Now, the fork business is very unfortunate, but both epoll/kqueue and
> pthreads have diminished fork into a state where using an event loop in
> both parent and child has become extremely hard (actually, doing anything
> in the child is hard with pthreads).

I do not understand what you mean here, probably because I never tried to mix fork, pthreads and epoll in a single program. Could you please provide some more details, or pointers to the relevant documentation?

Many thanks,
--
Gabriel Kerneis
Re: SIGCHLD not received
On Tue, May 29, 2012 at 04:31:05PM +0400, Denis Bilenko wrote:
> I have a weird test case (attached) where SIGCHLD is not being
> received by libev. I don't quite understand if it's a

Well, thanks for a simple-to-try test program :)

> 1) bug in how I use libev

This one - your test program forks after initialising the default loop, without calling ev_default_fork.

I added an ev_loop_destroy (EV_DEFAULT) at the end of test_main, and got a lot longer output, until:

(libev) error creating signal/async pipe: Too many open files

Which is probably a different bug in the test program.

Note that this only works because the fork isn't done while the default loop exists at the time - if you would fork while the default loop existed, you'd have to work with ev_default_fork and probably stop all watchers you inherited from the parent.

Now, the fork business is very unfortunate, but both epoll/kqueue and pthreads have diminished fork into a state where using an event loop in both parent and child has become extremely hard (actually, doing anything in the child is hard with pthreads).

If you plan to design a new application, it would probably be much easier in the long run if you either:

a) only use libev in your "worker" processes.
b) fork+exec worker processes, and use libev freely in both (avoiding epoll if you fork often in the parent).

--
Marc Lehmann
schm...@schmorp.de
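Option (b) above, fork+exec, can be sketched with plain POSIX calls as below. This is an illustration and not code from the thread: the helper name run_worker is hypothetical, and running the worker through /bin/sh -c is just a convenient assumption for the example. The point is that the child does nothing between fork and exec, so it starts from a fresh process image and inherits no event-loop state.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `cmd` through /bin/sh in a freshly exec'ed
 * process and return its exit status. Returns -1 if fork or waitpid
 * fails or the worker did not exit normally; 127 if the exec failed. */
int run_worker(const char *cmd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: go straight to exec, so no loop or watcher state from
         * the parent is ever used on this side of the fork. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                      /* only reached if exec failed */
    }

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);   /* retry if interrupted by a signal */

    if (r < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A real worker would be its own program that sets up its own loop after the exec; the shell command here only keeps the example short.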
SIGCHLD not received
Hi,

I have a weird test case (attached) where SIGCHLD is not being received by libev. I don't quite understand if it's a

1) bug in how I use libev
2) bug in libev itself
3) bug in the OS

The workaround I found that makes this test pass is to patch libev to start a timer (active only when there are active child watchers) which calls childcb periodically.

I tested it against the latest libev from CVS.

uname -a:
Linux denis-laptop 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686 athlon i386 GNU/Linux

Cheers,
Denis.

/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld
 *
 * Expected output: infinite sequence of *.*.*...
 *
 * Actual output:
 * denis@denis-laptop:~/work/libev-cvs$ ./testsigchld
 * *.*.*.Alarm clock
 *
 * (number of iterations is different each time)
 *
 */

#include "ev.c"

struct ev_child child;

void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

void stop_io(struct ev_loop* loop, ev_io *w, int revents) {
    ev_io_stop(loop, w);
}

void subprocess(void) {
    int pid;
    ev_default_loop(0);
    if (pid = fork()) {
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(ev_default_loop(0), &child);
        ev_run(ev_default_loop(0), 0);
        fprintf(stderr, "*");
    } else {
        _exit(0);
    }
}

void test_main(void) {
    int pipefd[2];
    while (pipe(pipefd))
        perror("pipe");
    int pid = fork();
    if (!pid) {
        close(pipefd[0]);
        subprocess();
        write(pipefd[1], "k", 1);
        _exit(0);
    }
    close(pipefd[1]);
    alarm(5);
    struct ev_io io;
    ev_io_init(&io, stop_io, pipefd[0], 1);
    ev_io_start(ev_default_loop(0), &io);
    ev_run(ev_default_loop(0), 0);
    alarm(0);
}

int main(int argc, char** argv) {
    while (1) {
        test_main();
        fprintf(stderr, ".");
    }
}
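The workaround described in this report (a timer that periodically checks for exited children instead of relying on SIGCHLD delivery) amounts to polling. A rough standalone equivalent, assuming plain POSIX and no libev, could look like this. The names poll_for_child and poll_demo are hypothetical, and this is not the actual libev patch, which was not posted.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: check every `interval_ms` milliseconds whether
 * `pid` has exited, at most `max_polls` times. Returns the exit status,
 * -1 on error or abnormal termination, -2 if the child is still running
 * after the last poll. */
int poll_for_child(pid_t pid, long interval_ms, int max_polls)
{
    struct timespec ts = { interval_ms / 1000,
                           (interval_ms % 1000) * 1000000L };

    for (int i = 0; i < max_polls; i++) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);   /* never blocks */
        if (r == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        if (r < 0 && errno != EINTR)
            return -1;
        nanosleep(&ts, NULL);            /* child still running: wait a bit */
    }
    return -2;
}

/* Small driver: fork a child that exits with `code` and reap it by polling. */
int poll_demo(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);
    return poll_for_child(pid, 10, 500);
}
```

Polling hides a missed SIGCHLD rather than explaining it, which is why the replies in this thread concentrate on the fork handling instead.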