Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 10:57:31PM +0400, Denis Bilenko wrote:
> No, it's not that. Attached is the program that checks the error codes
> and also waits for a child using child watcher. Still fails.

Now you changed your program considerably, and I am too lazy/sleepy to
track down further bugs in it.

If I modify your original program in the way I said, to close the extra
fd, call waitpid on the child and destroy the default loop before the next
fork, it runs fine here.

This is the end result, in case my description of my changes wasn't clear:
http://data.plan9.de/testsigchld.c

You could use that as a starting point, although my recommendations from my
first reply are still something you should consider seriously :)

-- 
The choice of a   Deliantra, the free code+content MORPG
  -==- _GNU_  http://www.deliantra.net
  ==-- _   generation
  ---==---(_)__  __   __  Marc Lehmann
  --==---/ / _ \/ // /\ \/ /  schm...@schmorp.de
  -=/_/_//_/\_,_/ /_/\_\

___
libev mailing list
libev@lists.schmorp.de
http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev


Re: SIGCHLD not received

2012-05-29 Thread Denis Bilenko
On Tue, May 29, 2012 at 10:08 PM, Marc Lehmann  wrote:
> You also don't call waitpid on your children - can you add error checking
> to all functions you call, to make sure you really do wait for your
> children (especially check fork returns)? Most likely you run out of
> process slots and fork fails, which your code "misinterprets" as a hang.

No, it's not that. Attached is the program that checks the error codes
and also waits for a child using child watcher. Still fails.
/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld
 *
 * Expected output: infinite sequence of "*."
 *
 * Actual output:
 *   denis@denis-laptop:~/work/libev-cvs$ ./testsigchld
 *   *.*.*.Alarm clock
 *
 *   (number of iterations is different each time)
 *
 * */
#include "ev.c"


void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

int CHECK(int retcode) {
    if (retcode < 0) {
        perror("fail");
        _exit(1);
    }
    return retcode;
}

void subprocess(void) {
    int pid;
    if (pid = CHECK(fork())) {
        struct ev_child child;
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(EV_DEFAULT, &child);
        ev_run(EV_DEFAULT, 0);
        CHECK(fprintf(stderr, "*"));
    }
    else {
        _exit(0);
    }
}


void test_main(void) {
    int pid = CHECK(fork());
    if (!pid) {
        ev_loop_fork(EV_DEFAULT);
        subprocess();
        _exit(0);
    }

    alarm(5);

    struct ev_child child;
    ev_child_init(&child, stop_child, pid, 0);
    ev_child_start(EV_DEFAULT, &child);
    ev_run(EV_DEFAULT, 0);

    alarm(0);
}

int main(int argc, char** argv) {
    ev_default_loop(EVBACKEND_SELECT);
    while (1) {
        test_main();
        CHECK(fprintf(stderr, "."));
    }
}

Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 09:57:30PM +0400, Denis Bilenko wrote:
> On Tue, May 29, 2012 at 7:12 PM, Marc Lehmann  wrote:
> > This one - your test program forks after initialising the default loop,
> > without calling ev_default_fork.
> 
> OK, I've fixed the test program to do that and also fixed a fd leak.
> It takes a bit longer to fail now, but it still fails.

You also don't call waitpid on your children - can you add error checking
to all functions you call, to make sure you really do wait for your
children (especially check fork returns)? Most likely you run out of
process slots and fork fails, which your code "misinterprets" as a hang.
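
A minimal sketch of the kind of error checking asked for here, in plain
POSIX C without libev. The helper name spawn_and_reap is made up for
illustration and is not part of the test program; it only shows checking
the return value of fork and really waiting for the child with waitpid:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from libev or the test program):
 * fork a child that exits with `code`, wait for it, and return its
 * exit status, or -1 if fork or waitpid failed. */
int spawn_and_reap(int code)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");      /* e.g. EAGAIN when out of process slots */
        return -1;
    }
    if (pid == 0)
        _exit(code);         /* child: exit immediately */

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);   /* retry if interrupted by a signal */
    } while (r < 0 && errno == EINTR);
    if (r < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this, a return value of -1 together with a "fork: ..." message on
stderr would point at a failed fork rather than at a hang in the loop.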

> > Note that this only works because the fork isn't done while the default
> > loop exists at the time - if you would fork while the default loop
> > existed, you'd have to work with ev_default_fork and probably stop all
> > watchers you inherited from the parent.
> 
> At the time of fork there are no active watchers in the test program.

Well, there are, inside libev. But that's pretty irrelevant, the
documentation doesn't say you can ignore the fork if you don't have any
active watchers.

> To rule out epoll issues, I'm now using EVBACKEND_SELECT explicitly -
> still fails. There are no pthreads either, so it has to be something else.

You still have to call ev_default_fork before you can reuse a loop in the
child, there are no exceptions.




Re: SIGCHLD not received

2012-05-29 Thread Denis Bilenko
On Tue, May 29, 2012 at 7:12 PM, Marc Lehmann  wrote:
> This one - your test program forks after initialising the default loop,
> without calling ev_default_fork.

OK, I've fixed the test program to do that and also fixed a fd leak.
It takes a bit longer to fail now, but it still fails.

> I added an ev_loop_destroy (EV_DEFAULT) at the end of test_main, and got a
> lot longer output, until:
>
> (libev) error creating signal/async pipe: Too many open files

With fd leak fixed, this should no longer happen. I've tried adding
ev_loop_destroy too - it still fails.

> Note that this only works because the fork isn't done while the default
> loop exists at the time - if you would fork while the default loop
> existed, you'd have to work with ev_default_fork and probably stop all
> watchers you inherited from the parent.

At the time of fork there are no active watchers in the test program.

> Now, the fork business is very unfortunate, but both epoll/kqueue and
> pthreads have diminished fork into a state where using an event loop in
> both parent and child has become extremely hard (actually, doing anything
> in the child is hard with pthreads).

To rule out epoll issues, I'm now using EVBACKEND_SELECT explicitly -
still fails. There are no pthreads either, so it has to be something else.

Attached is modified version with all the fixes mentioned above.
Please try it as well :) I've also tried using EVFLAG_FORKCHECK but
that did not help either.
/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld
 *
 * Expected output: infinite sequence of *.*.*...
 *
 * Actual output:
 *   denis@denis-laptop:~/work/libev-cvs$ ./testsigchld
 *   *.*.*.Alarm clock
 *
 *   (number of iterations is different each time)
 *
 * */
#include "ev.c"


struct ev_child child;

void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

void stop_io(struct ev_loop* loop, ev_io *w, int revents) {
    ev_io_stop(loop, w);
}


void subprocess(void) {
    int pid;
    if (pid = fork()) {
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(EV_DEFAULT, &child);
        ev_run(EV_DEFAULT, 0);
        fprintf(stderr, "*");
    }
    else {
        _exit(0);
    }
}


void test_main(void) {
    int pipefd[2];

    while (pipe(pipefd)) perror("pipe");

    int pid = fork();
    if (!pid) {
        ev_loop_fork(EV_DEFAULT);
        close(pipefd[0]);
        subprocess();
        write(pipefd[1], "k", 1);
        _exit(0);
    }

    close(pipefd[1]);
    alarm(5);

    struct ev_io io;
    ev_io_init(&io, stop_io, pipefd[0], 1);
    ev_io_start(EV_DEFAULT, &io);
    ev_run(EV_DEFAULT, 0);
    close(pipefd[0]);

    alarm(0);
}

int main(int argc, char** argv) {
    ev_default_loop(EVBACKEND_SELECT);
    while (1) {
        test_main();
        fprintf(stderr, ".");
    }
}

Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 05:29:20PM +0200, Gabriel Kerneis wrote:
> I do not understand what you mean here, probably because I never tried to mix
> fork, pthreads and epoll in a single program.  Could you please provide some

Oh, it's quite independent of each other, too, i.e. you don't have to use
them together to get the problems.




Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 05:29:20PM +0200, Gabriel Kerneis wrote:
> I do not understand what you mean here, probably because I never tried to mix
> fork, pthreads and epoll in a single program.  Could you please provide some
> more details,

epoll: epoll file descriptors are available in the child, and events are
delivered to both, even for file descriptors that are closed in one of the
processes. since you need an fd to remove fds from an epoll set, you can't
remove that fd (libev uses a heuristic to detect this case and then recreates
the whole epoll set, which can be slow).

kqueue: the kqueue fd is closed during fork (yes, missing :), or disabled
(yes, fd is still there) and event libraries cannot really distinguish
between the cases. this is not as bad as epoll, but still requires special
fork support from event libraries.

pthreads: after pthread_create, the child environment after fork is
as restricted as in a signal handler, i.e. after fork, you cannot use
malloc/printf or anything else that's unsafe in a signal handler.

the result of this is that it's no fun at all to use fork for
multiprocessing.

(not that fork ever was so good for multiprocessing when doing something
more complicated than a preforked server).
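
To illustrate the pthreads point above: a sketch, assuming the parent might
be multi-threaded, of a child that limits itself to async-signal-safe calls
(write and _exit) after fork. The helper name fork_minimal_child is
hypothetical and only serves this example:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that uses only async-signal-safe
 * calls (write, _exit), then reap it in the parent.
 * Returns the child's exit status, or -1 on failure. */
int fork_minimal_child(int out_fd)
{
    static const char msg[] = "child alive\n";
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* No printf/malloc here: another thread may have held their
         * internal locks at fork time, and that thread does not exist
         * in the child to release them. */
        ssize_t n = write(out_fd, msg, sizeof msg - 1);
        _exit(n == (ssize_t)(sizeof msg - 1) ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything beyond this in the child (stdio, allocation, an event loop) is
what becomes hard once threads are involved.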




Re: SIGCHLD not received

2012-05-29 Thread Gabriel Kerneis
Hi Marc,

On Tue, May 29, 2012 at 05:12:22PM +0200, Marc Lehmann wrote:
> Now, the fork business is very unfortunate, but both epoll/kqueue and
> pthreads have diminished fork into a state where using an event loop in
> both parent and child has become extremely hard (actually, doing anything
> in the child is hard with pthreads).

I do not understand what you mean here, probably because I never tried to mix
fork, pthreads and epoll in a single program.  Could you please provide some
more details, or pointers to the relevant documentation?

Many thanks,
-- 
Gabriel Kerneis



Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 04:31:05PM +0400, Denis Bilenko wrote:
> I have a weird test case (attached) where SIGCHLD is not being
> received by libev. I don't quite understand if it's a

Well, thanks for a simple-to-try test program :)

> 1) bug in how I use libev

This one - your test program forks after initialising the default loop,
without calling ev_default_fork.

I added an ev_loop_destroy (EV_DEFAULT) at the end of test_main, and got a
lot longer output, until:

(libev) error creating signal/async pipe: Too many open files

Which is probably a different bug in the test program.

Note that this only works because the fork isn't done while the default
loop exists at the time - if you would fork while the default loop
existed, you'd have to work with ev_default_fork and probably stop all
watchers you inherited from the parent.

Now, the fork business is very unfortunate, but both epoll/kqueue and
pthreads have diminished fork into a state where using an event loop in
both parent and child has become extremely hard (actually, doing anything
in the child is hard with pthreads).

If you plan to design a new application, it would probably be much easier in
the long run if you either:

a) only use libev in your "worker" processes.
b) fork+exec worker processes, and use libev freely in both (avoiding
   epoll if you fork often in the parent).
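
Option b) could look roughly like this in plain POSIX C. The helper name
run_worker is hypothetical, and the blocking waitpid merely stands in for
whatever the parent really does to collect its workers (for example an
ev_child watcher):

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, immediately exec argv[0] in the child,
 * and return the worker's exit status (or -1 on failure). */
int run_worker(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* The child does nothing between fork and exec, so no event
         * loop state inherited from the parent is ever touched. */
        execvp(argv[0], argv);
        _exit(127);          /* only reached if exec failed */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the worker starts from a fresh process image, it can create its
own default loop without any ev_default_fork handling.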




SIGCHLD not received

2012-05-29 Thread Denis Bilenko
Hi,

I have a weird test case (attached) where SIGCHLD is not being
received by libev. I don't quite understand if it's a

1) bug in how I use libev
2) bug in libev itself
3) bug in the OS

The workaround I found that makes this test pass is to patch libev to
start a timer (active only when there are active child watchers) which
calls childcb periodically.
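
For reference, the idea behind this workaround, polling for exited children
instead of relying solely on SIGCHLD delivery, can be sketched in plain
POSIX C. This is not the libev patch itself; reap_exited_children is a
hypothetical name, and in the patched libev the equivalent check would be
driven by the timer:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: reap every child that has already exited,
 * without blocking. Returns the number of children reaped. */
int reap_exited_children(void)
{
    int reaped = 0;
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            reaped++;        /* one zombie collected, look for more */
            continue;
        }
        if (pid < 0 && errno == EINTR)
            continue;        /* interrupted, just retry */
        break;               /* 0: children still running; ECHILD: none left */
    }
    return reaped;
}
```

Calling such a function from a periodic timer hides a missed SIGCHLD, but
it does not explain why the signal was missed in the first place.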

I tested it against the latest libev from CVS.

uname -a: Linux denis-laptop 3.2.0-23-generic-pae #36-Ubuntu SMP Tue
Apr 10 22:19:09 UTC 2012 i686 athlon i386 GNU/Linux

Cheers,
Denis.
/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld
 *
 * Expected output: infinite sequence of *.*.*...
 *
 * Actual output:
 *   denis@denis-laptop:~/work/libev-cvs$ ./testsigchld
 *   *.*.*.Alarm clock
 *
 *   (number of iterations is different each time)
 *
 * */
#include "ev.c"


struct ev_child child;

void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

void stop_io(struct ev_loop* loop, ev_io *w, int revents) {
    ev_io_stop(loop, w);
}


void subprocess(void) {
    int pid;
    ev_default_loop(0);
    if (pid = fork()) {
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(ev_default_loop(0), &child);
        ev_run(ev_default_loop(0), 0);
        fprintf(stderr, "*");
    }
    else {
        _exit(0);
    }
}


void test_main(void) {
    int pipefd[2];

    while (pipe(pipefd)) perror("pipe");

    int pid = fork();
    if (!pid) {
        close(pipefd[0]);
        subprocess();
        write(pipefd[1], "k", 1);
        _exit(0);
    }

    close(pipefd[1]);
    alarm(5);

    struct ev_io io;
    ev_io_init(&io, stop_io, pipefd[0], 1);
    ev_io_start(ev_default_loop(0), &io);
    ev_run(ev_default_loop(0), 0);

    alarm(0);
}

int main(int argc, char** argv) {
    while (1) {
        test_main();
        fprintf(stderr, ".");
    }
}