Re: SIGCHLD not received

2012-07-25 Thread Denis Bilenko
On Tue, Jul 24, 2012 at 8:56 PM, Marc Lehmann  wrote:
>>
>> in loop_fork() you call
>
> Hm, right.
>
> Can you try the CVS version of libev? That one doesn't rely on this check
> anymore, and will probably just work (the test in the signal handler relied
> on int being atomic and was actually a temporary bug workaround, which the
> cvs version should fix)

Yes, it appears to be fixed now.

___
libev mailing list
libev@lists.schmorp.de
http://lists.schmorp.de/cgi-bin/mailman/listinfo/libev


Re: SIGCHLD not received

2012-07-24 Thread Marc Lehmann
On Tue, Jul 24, 2012 at 12:06:34PM +0400, Denis Bilenko wrote:
> Expected output:
> 
> *. goes on infinitely

That's what I seem to get, but that's to be expected from races, which is
why I asked.

> > Can you explain the mechanism or nature of that race condition?
> 
> in loop_fork() you call

Hm, right.

Can you try the CVS version of libev? That one doesn't rely on this check
anymore, and will probably just work (the test in the signal handler relied
on int being atomic and was actually a temporary bug workaround, which the
cvs version should fix)

-- 
The choice of a   Deliantra, the free code+content MORPG
  -==- _GNU_  http://www.deliantra.net
  ==-- _   generation
  ---==---(_)__  __   __  Marc Lehmann
  --==---/ / _ \/ // /\ \/ /  schm...@schmorp.de
  -=/_/_//_/\_,_/ /_/\_\



Re: SIGCHLD not received

2012-07-24 Thread Denis Bilenko
On Mon, Jul 23, 2012 at 1:24 PM, Marc Lehmann  wrote:
> Hmm, it's been a long time - what exactly is the behaviour I am supposed to
> see from the program without the sleep, and what behaviour do you get?

Here's the program with sleep removed (no further changes necessary),
just tested it against libev in CVS: https://gist.github.com/3168594

Actual output:

denis@ubuntu:~/work/libev-cvs$ ./testsigchld5
*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.*.Alarm clock

Expected output:

*. goes on infinitely

>> The source of the issue is a race condition between ev_feed_signal and
>> loop_fork(). If I block signals at the beginning of loop_fork() and
>> unblock them at the end I have it fixed.

> Can you explain the mechanism or nature of that race condition?

in loop_fork() you call

  ev_io_stop (EV_A_ &pipe_w);
  // signal delivered right here

in ev_feed_signal() you check

  if (!ev_active (&pipe_w))
    return;

thus the ev_feed_signal() handler becomes a no-op.
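
The window described above can be modelled without libev. The following is
only a sketch under stated assumptions: pipe_active, sig_recorded,
model_handler and model_loop_fork are hypothetical stand-ins for the libev
internals, and raise() is used to force delivery at a chosen point.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-ins for libev internals; this is not the real ev.c. */
static volatile sig_atomic_t pipe_active = 1;  /* models ev_active (&pipe_w) */
static volatile sig_atomic_t sig_recorded = 0; /* models "signal was noted" */

/* Models the check in the handler: a no-op while the watcher is stopped. */
static void model_handler(int signum)
{
    (void)signum;
    if (!pipe_active)
        return;               /* the signal is dropped here */
    sig_recorded = 1;
}

/* Models loop_fork(). With in_window != 0 the signal arrives between
 * stopping and restarting the pipe watcher; otherwise it arrives afterwards.
 * Returns 1 if the signal was recorded, 0 if it was lost, -1 on error. */
static int model_loop_fork(int signum, int in_window)
{
    struct sigaction sa, old;

    assert(signum > 0);
    sa.sa_handler = model_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(signum, &sa, &old) < 0)
        return -1;

    sig_recorded = 0;
    pipe_active = 0;          /* ev_io_stop (EV_A_ &pipe_w); */
    if (in_window)
        raise(signum);        /* signal delivered right here */
    pipe_active = 1;          /* watcher restarted on the new pipe */
    if (!in_window)
        raise(signum);

    sigaction(signum, &old, NULL);
    return sig_recorded;
}
```

Calling model_loop_fork(SIGUSR1, 1) reports the signal as lost, while
model_loop_fork(SIGUSR1, 0) reports it as recorded.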

>> Do you acknowledge this to be a bug in libev?
>
> If it is a bug, I will of course acknowledge it - that's a weird question
> to ask(?).

It's a very real question: depending on the answer, I can either
continue to use stock libev or have to maintain my own slightly
modified branch, which is a nuisance.

>
>> Do you need help with fixing it?
>
> I need help understanding it, to see if it's really a bug - keep in mind
> that you made a lot of invalid reports, and it takes time to sift through
> all and find out if there really is an issue.

Just use the last test program (in this email).



Re: SIGCHLD not received

2012-07-23 Thread Marc Lehmann
On Mon, Jul 23, 2012 at 12:15:57PM +0400, Denis Bilenko wrote:
> > There's another issue, in which libev does lose the signal forever. To
> > reproduce it, remove sleep(1) from the latest test program.

Hmm, it's been a long time - what exactly is the behaviour I am supposed to
see from the program without the sleep, and what behaviour do you get?

> > The source of the issue is a race condition between ev_feed_signal and
> > loop_fork(). If I block signals at the beginning of loop_fork() and
> > unblock them at the end I have it fixed.

Can you explain the mechanism or nature of that race condition?

> Do you acknowledge this to be a bug in libev?

If it is a bug, I will of course acknowledge it - that's a weird question
to ask(?).

> Do you need help with fixing it?

I need help understanding it, to see if it's really a bug - keep in mind
that you made a lot of invalid reports, and it takes time to sift through
all and find out if there really is an issue.

PS: your patch was garbled (but human-readable) in that e-mail.



Re: SIGCHLD not received

2012-07-23 Thread Denis Bilenko
On Thu, May 31, 2012 at 10:32 PM, Denis Bilenko  wrote:
> On Thu, May 31, 2012 at 9:13 PM, Marc Lehmann  wrote:
>>> The SIGCHLD is received, of course, but then libev fails to notify the
>>> child watcher in a timely manner. Instead, it delays the notification
>>> by 60 seconds (MAX_BLOCKTIME).
>>
>> I'll have a look, however, from looking at it, the test will always fail
>> if the system is busy enough, and libev doesn't lose the signal, so
>> behaviour is essentially correct. I'll see if this case can be optimised
>> without drawback for other cases.
>
> Cool, thanks.
>
> There's another issue, in which libev does lose the signal forever. To
> reproduce it, remove sleep(1) from the latest test program.
>
> The source of the issue is a race condition between ev_feed_signal and
> loop_fork(). If I block signals at the beginning of loop_fork() and
> unblock them at the end I have it fixed.

Marc, what's your resolution on this? Do you acknowledge this to be a
bug in libev? Do you need help with fixing it?

I'm asking about the issue where the event is lost forever, not the
one where it's delayed for MAX_BLOCKTIME time.



Re: SIGCHLD not received

2012-05-31 Thread Denis Bilenko
On Thu, May 31, 2012 at 9:13 PM, Marc Lehmann  wrote:
>> The SIGCHLD is received, of course, but then libev fails to notify the
>> child watcher in a timely manner. Instead, it delays the notification
>> by 60 seconds (MAX_BLOCKTIME).
>
> I'll have a look, however, from looking at it, the test will always fail
> if the system is busy enough, and libev doesn't lose the signal, so
> behaviour is essentially correct. I'll see if this case can be optimised
> without drawback for other cases.

Cool, thanks.

There's another issue, in which libev does lose the signal forever. To
reproduce it, remove sleep(1) from the latest test program.

The source of the issue is a race condition between ev_feed_signal and
loop_fork(). If I block signals at the beginning of loop_fork() and
unblock them at the end I have it fixed.

Here's what I do:

diff -u -r1.442 ev.c
--- ev.c31 May 2012 15:47:59 -  1.442
+++ ev.c31 May 2012 18:22:12 -
@@ -467,7 +467,9 @@
 /*#define MIN_INTERVAL  0.0095367431640625 /* 1/2**20, good till 2200 */

 #define MIN_TIMEJUMP  1. /* minimum timejump that gets detected (if monotonic clock available) */
+#ifndef MAX_BLOCKTIME
 #define MAX_BLOCKTIME 59.743 /* never wait longer than this time (to detect time jumps) */
+#endif

 #define EV_TV_SET(tv,t) do { tv.tv_sec = (long)t; tv.tv_usec = (long)((t - tv.tv_sec) * 1e6); } while (0)
 #define EV_TS_SET(ts,t) do { ts.tv_sec = (long)t; ts.tv_nsec = (long)((t - ts.tv_sec) * 1e9); } while (0)
@@ -2519,6 +2521,10 @@
 inline_size void
 loop_fork (EV_P)
 {
+  sigset_t mask, oldmask;
+  sigfillset(&mask);
+  int sigmask_error = sigprocmask(SIG_BLOCK, &mask, &oldmask);
+
 #if EV_USE_PORT
   if (backend == EVBACKEND_PORT  ) port_fork   (EV_A);
 #endif
@@ -2558,6 +2564,7 @@
 }

   postfork = 0;
+  if (!sigmask_error) sigprocmask(SIG_SETMASK, &oldmask, NULL);
 }

 #if EV_MULTIPLICITY


$ gcc -DMAX_BLOCKTIME=1 -DEV_STANDALONE=1 testsigchld4.c -o testsigchld4
$ ./testsigchld4

Setting MAX_BLOCKTIME to 1 is needed to suppress the first issue - the
two are unrelated.

Now if I run testsigchld4 it sometimes pauses for 1 second but never fails.
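
The block/unblock pattern used in the patch can be tried in isolation. This
is a minimal sketch, not libev code: run_with_signals_blocked,
critical_section and demo_blocked_delivery are hypothetical names, and
SIGUSR1 stands in for SIGCHLD. A signal raised inside the critical section
stays pending and is delivered once the old mask is restored.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t handled = 0;

static void count_signal(int signum)
{
    (void)signum;
    handled++;
}

/* Run fn(arg) with all signals blocked, then restore the previous mask.
 * Returns 0 on success, -1 if the mask could not be changed. */
static int run_with_signals_blocked(void (*fn)(void *), void *arg)
{
    sigset_t mask, oldmask;

    assert(fn != NULL);
    sigfillset(&mask);
    if (sigprocmask(SIG_BLOCK, &mask, &oldmask) < 0)
        return -1;
    fn(arg);
    return sigprocmask(SIG_SETMASK, &oldmask, NULL);
}

/* Example critical section: a signal raised here only becomes pending. */
static void critical_section(void *arg)
{
    int *seen_inside = arg;

    raise(SIGUSR1);
    *seen_inside = handled;   /* still 0: the handler has not run yet */
}

/* Returns 1 if the signal was held back inside the critical section and
 * delivered after the mask was restored, 0 otherwise, -1 on error. */
static int demo_blocked_delivery(void)
{
    struct sigaction sa, old;
    int seen_inside = -1;
    int ok;

    sa.sa_handler = count_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, &old) < 0)
        return -1;

    handled = 0;
    if (run_with_signals_blocked(critical_section, &seen_inside) < 0)
        return -1;
    ok = (seen_inside == 0 && handled == 1);

    sigaction(SIGUSR1, &old, NULL);
    return ok;
}
```

Restoring the saved mask with SIG_SETMASK, rather than unblocking
everything, keeps any signals the caller had already blocked.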



Re: SIGCHLD not received

2012-05-31 Thread Marc Lehmann
On Thu, May 31, 2012 at 12:39:24AM +0400, Denis Bilenko wrote:
>  - not use the event loop in the main process; use waitpid()
>  - fail on first iteration
> 
> You can see it here: https://gist.github.com/gists/2838734
> 
> The SIGCHLD is received, of course, but then libev fails to notify the
> child watcher in a timely manner. Instead, it delays the notification
> by 60 seconds (MAX_BLOCKTIME).

I'll have a look, however, from looking at it, the test will always fail
if the system is busy enough, and libev doesn't lose the signal, so
behaviour is essentially correct. I'll see if this case can be optimised
without drawback for other cases.



Re: SIGCHLD not received

2012-05-30 Thread Denis Bilenko
OK, I further simplified the test program to:

 - not use the event loop in the main process; use waitpid()
 - fail on first iteration

You can see it here: https://gist.github.com/gists/2838734

The SIGCHLD is received, of course, but then libev fails to notify the
child watcher in a timely manner. Instead, it delays the notification
by 60 seconds (MAX_BLOCKTIME).



Re: SIGCHLD not received

2012-05-30 Thread Marc Lehmann
On Wed, May 30, 2012 at 06:47:44PM +0400, Denis Bilenko wrote:
> > The test program still forks with an existing loop,
> 
> Which is legal, otherwise what's the point of ev_loop_fork and fork
> watchers if not to fork an existing event loop?

The point of ev_loop_fork is to be called.

> I think I figured out why it fails though - the child forked off in
> subprocess() is short-lived and can die before loop_fork() was
> actually executed by ev_run(). Merely calling ev_loop_fork() is not
> enough, but if it's followed by ev_run() (which exits immediately
> because there are no watchers) then it works.

The default loop must be initialised before any child exits, if that is
what the problem was, yes.

> So SIGCHLD was received but it wrote to a pipe end that was soon
> replaced by another pipe in loop_fork().

That can happen, but should be immaterial, as libev compensates for this by
always checking for signals and asyncs after it recreates the pipe.
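
The idea of re-checking after the pipe is recreated can be illustrated with a
plain self-pipe. This is a sketch under assumptions, not libev's
implementation: wake_fd, sig_pending, recreate_pipe, take_pending and
demo_recreate are hypothetical names, and SIGUSR1 stands in for SIGCHLD. The
handler records the signal in a flag before writing the wake-up byte, so the
flag survives even when the byte is lost with the old pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

static int wake_fd[2] = { -1, -1 };            /* self-pipe: [0] read, [1] write */
static volatile sig_atomic_t sig_pending = 0;

static void on_signal(int signum)
{
    int saved_errno = errno;

    (void)signum;
    sig_pending = 1;                           /* record the signal first */
    if (wake_fd[1] >= 0) {
        ssize_t r = write(wake_fd[1], "x", 1); /* then wake the loop */
        (void)r;
    }
    errno = saved_errno;
}

/* Replace the wake-up pipe; bytes still in the old pipe are discarded.
 * A real implementation would also have to consider signals arriving
 * while the descriptors are being swapped. */
static int recreate_pipe(void)
{
    if (wake_fd[0] >= 0) {
        close(wake_fd[0]);
        close(wake_fd[1]);
    }
    return pipe(wake_fd);
}

/* Check the flag unconditionally, regardless of what the pipe contains. */
static int take_pending(void)
{
    int was_pending = sig_pending;

    sig_pending = 0;
    return was_pending;
}

/* Returns 1 if the signal is still seen after the pipe was replaced. */
static int demo_recreate(void)
{
    struct sigaction sa;

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (recreate_pipe() < 0 || sigaction(SIGUSR1, &sa, NULL) < 0)
        return -1;
    assert(wake_fd[0] >= 0 && wake_fd[1] >= 0);

    raise(SIGUSR1);            /* the wake-up byte lands in the old pipe */
    if (recreate_pipe() < 0)   /* ...which is now gone */
        return -1;
    return take_pending();     /* the flag is checked anyway */
}
```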

It's also not what the test program does: in the test program, the _other_
default loop is destroyed on each loop run, and doesn't exist sometimes
when the child exits fast enough (presumably).




Re: SIGCHLD not received

2012-05-30 Thread Denis Bilenko
On Wed, May 30, 2012 at 5:33 PM, Marc Lehmann  wrote:
> On Wed, May 30, 2012 at 02:55:52PM +0400, Denis Bilenko wrote:
>> Yes, it seems that waitpid() somehow masks or avoids the bug. If I
>> apply this small patch to your program it fails - why?
>
> The test program still forks with an existing loop,

Which is legal, otherwise what's the point of ev_loop_fork and fork
watchers if not to fork an existing event loop?

I think I figured out why it fails though - the child forked off in
subprocess() is short-lived and can die before loop_fork() was
actually executed by ev_run(). Merely calling ev_loop_fork() is not
enough, but if it's followed by ev_run() (which exits immediately
because there are no watchers) then it works.

So SIGCHLD was received but it wrote to a pipe end that was soon
replaced by another pipe in loop_fork().



Re: SIGCHLD not received

2012-05-30 Thread Marc Lehmann
On Wed, May 30, 2012 at 02:55:52PM +0400, Denis Bilenko wrote:
> Yes, it seems that waitpid() somehow masks or avoids the bug. If I
> apply this small patch to your program it fails - why?

The test program still forks with an existing loop, and when I run it
after the patch I quickly get fork errors in my other terminals, so it's
likely still buggy *somehow*.

Frankly, I don't see the purpose in debugging this program further, it's just
way too buggy to be useful as a test program.

> The recommendations are good - I have found empirically too that
> fork+exec is just more reliable than fork when you use libev loop

There is no difference in reliability with libev or not, and libev is
reliable with both fork and fork+exec. What you found out empirically is
that your test program is very, very buggy, not that fork is less reliable
when you use libev.

My recommendations were to not do it that way, because it is hard. And
not surprisingly, we have found at least four critical bugs in the test
program.

> the parent. However, this particular test program is extracted from
> python-gevent test suite, not a real application. I cannot change it
> much, except to fix bugs in libev usage, like call ev_loop_fork() when
> needed (which original Python program already did - it was lost during
> translation to C).

If you can only fix some bugs in it but not others, then you simply need
to live with the bugs.

> BTW, the original test case did not have an infinite loop, it just
> failed randomly, which suggests that the failure may happen on the first
> iteration.

Well, it was not the only bug in the test program.

> My solution so far has been to add a timer reaping the children
> periodically (attached). But I still have no idea what causes SIGCHLD
> to be skipped in the first place.

There is no evidence whatsoever that sigchld has been skipped (or, if it
has been skipped, that this is caused by libev).



Re: SIGCHLD not received

2012-05-30 Thread Denis Bilenko
On Wed, May 30, 2012 at 1:29 AM, Marc Lehmann  wrote:
> On Tue, May 29, 2012 at 10:57:31PM +0400, Denis Bilenko wrote:
>> No, it's not that. Attached is the program that checks the error codes
>> and also waits for a child using child watcher. Still fails.
>
> Now you changed your program considerably, and I am too lazy/sleepy to
> track down further bugs in it.

Don't worry - I don't feel entitled to your time just because I use
your (free) software. I actually appreciate you taking time to reply
here (and to release libev in the first place - thanks!).

> If I modify your original program in the way I said, to close the extra
> fd, call waitpid on the child and destroy the default loop before the next
> fork, it runs fine here.

Yes, it seems that waitpid() somehow masks or avoids the bug. If I
apply this small patch to your program it fails - why? Wouldn't
ev_run() take care of waiting for the child for me the same way as the
removed waitpid() did?

--- testsigchld_marc_orig.c 2012-05-30 14:23:52.904252701 +0400
+++ testsigchld_marc.c  2012-05-30 14:20:08.263154490 +0400
@@ -55,12 +55,14 @@
 alarm(5);

 struct ev_io io;
+struct ev_child child;
 ev_io_init(&io, stop_io, pipefd[0], 1);
 ev_io_start(ev_default_loop(0), &io);
+ev_child_init(&child, stop_child, pid, 0);
+ev_child_start(ev_default_loop(0), &child);
 ev_run(ev_default_loop(0), 0);
 ev_loop_destroy (EV_DEFAULT);
 close(pipefd[0]);
-waitpid(pid, 0, 0);//D

 alarm(0);
 }

> You could use that as starting point, although my recommendations from my
> first reply are still something you should consider seriously :)

The recommendations are good - I have found empirically too that
fork+exec is just more reliable than fork when you use libev loop in
the parent. However, this particular test program is extracted from
python-gevent test suite, not a real application. I cannot change it
much, except to fix bugs in libev usage, like call ev_loop_fork() when
needed (which original Python program already did - it was lost during
translation to C). BTW, the original test case did not have an
infinite loop, it just failed randomly, which suggests that the
failure may happen on the first iteration.

My solution so far has been to add a timer reaping the children
periodically (attached). But I still have no idea what causes SIGCHLD
to be skipped in the first place.
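
Periodic reaping of this kind can be approximated in plain POSIX. The sketch
below is not the attached patch (whose contents are not shown here); it only
illustrates polling with waitpid() and WNOHANG. The names
reap_exited_children and spawn_and_poll are hypothetical, and the 1 ms poll
interval is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Reap every child that has already exited, without blocking.
 * Returns the number of children reaped by this call. */
static int reap_exited_children(void)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);

        if (pid > 0) {
            reaped++;
            continue;
        }
        if (pid < 0 && errno == EINTR)
            continue;
        break;  /* 0: nothing ready yet; -1 with ECHILD: no children left */
    }
    return reaped;
}

/* Fork n short-lived children and poll until all of them are reaped,
 * giving up after roughly five seconds. Returns the number reaped. */
static int spawn_and_poll(int n)
{
    struct timespec tick = { 0, 1000000 };  /* 1 ms between polls */
    int started = 0, reaped = 0, tries;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();

        if (pid == 0)
            _exit(0);
        if (pid > 0)
            started++;
    }
    for (tries = 0; reaped < started && tries < 5000; tries++) {
        reaped += reap_exited_children();
        if (reaped < started)
            nanosleep(&tick, NULL);
    }
    return reaped;
}
```

Polling like this does not depend on SIGCHLD being noticed at all, which is
why it hides the problem rather than explaining it.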


libev_childpollev.patch
Description: Binary data

Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 10:57:31PM +0400, Denis Bilenko wrote:
> No, it's not that. Attached is the program that checks the error codes
> and also waits for a child using child watcher. Still fails.

Now you changed your program considerably, and I am too lazy/sleepy to
track down further bugs in it.

If I modify your original program in the way I said, to close the extra
fd, call waitpid on the child and destroy the default loop before the next
fork, it runs fine here.

This is the end result, in case my description of my changes wasn't clear:
http://data.plan9.de/testsigchld.c

You could use that as starting point, although my recommendations from my
first reply are still something you should consider seriously :)



Re: SIGCHLD not received

2012-05-29 Thread Denis Bilenko
On Tue, May 29, 2012 at 10:08 PM, Marc Lehmann  wrote:
> You also don't call waitpid on your children - can you add error checking
> to all functions you call, to make sure you really do wait for your
> children (especially check fork returns)? Most likely you run out of
> process slots and fork fails, which your code "misinterprets" as a hang.

No, it's not that. Attached is the program that checks the error codes
and also waits for a child using child watcher. Still fails.
/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld 
 *
 * Expected output: infinite sequence of "*."
 *
 * Actual output:
 *   denis@denis-laptop:~/work/libev-cvs$ ./testsigchld 
 *   *.*.*.Alarm clock
 *
 *   (number of iterations is different each time)
 *
 * */
#include "ev.c"


void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

int CHECK(int retcode) {
    if (retcode < 0) {
        perror("fail");
        _exit(1);
    }
    return retcode;
}

void subprocess(void) {
    int pid;
    if (pid = CHECK(fork())) {
        struct ev_child child;
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(EV_DEFAULT, &child);
        ev_run(EV_DEFAULT, 0);
        CHECK(fprintf(stderr, "*"));
    }
    else {
        _exit(0);
    }
}


void test_main(void) {
    int pid = CHECK(fork());
    if (!pid) {
        ev_loop_fork(EV_DEFAULT);
        subprocess();
        _exit(0);
    }

    alarm(5);

    struct ev_child child;
    ev_child_init(&child, stop_child, pid, 0);
    ev_child_start(EV_DEFAULT, &child);
    ev_run(EV_DEFAULT, 0);

    alarm(0);
}

int main(int argc, char** argv) {
    ev_default_loop(EVBACKEND_SELECT);
    while (1) {
        test_main();
        CHECK(fprintf(stderr, "."));
    }
}

Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 09:57:30PM +0400, Denis Bilenko wrote:
> On Tue, May 29, 2012 at 7:12 PM, Marc Lehmann  wrote:
> > This one - your test program forks after initialising the default loop,
> > without calling ev_default_fork.
> 
> OK, I've fixed the test program to do that and also fixed a fd leak.
> It takes a bit longer to fail now, but it still fails.

You also don't call waitpid on your children - can you add error checking
to all functions you call, to make sure you really do wait for your
children (especially check fork returns)? Most likely you run out of
process slots and fork fails, which your code "misinterprets" as a hang.

> > Note that this only works because the fork isn't done while the default
> > loop exists at the time - if you would fork while the default loop
> > existed, you'd have to work with ev_default_fork and probably stop all
> > watchers you inherited from the parent.
> 
> At the time of fork there are no active watchers in the test program.

Well, there are, inside libev. But that's pretty irrelevant, the
documentation doesn't say you can ignore the fork if you don't have any
active watchers.

> To rule out epoll issues, I'm now using EVBACKEND_SELECT explicitly -
> still fails. There are no pthreads either, so it has to be something else.

You still have to call ev_default_fork before you can reuse a loop in the
child, there are no exceptions.



Re: SIGCHLD not received

2012-05-29 Thread Denis Bilenko
On Tue, May 29, 2012 at 7:12 PM, Marc Lehmann  wrote:
> This one - your test program forks after initialising the default loop,
> without calling ev_default_fork.

OK, I've fixed the test program to do that and also fixed a fd leak.
It takes a bit longer to fail now, but it still fails.

> I added an ev_loop_destroy (EV_DEFAULT) at the end of test_main, and got a
> lot longer output, until:
>
> (libev) error creating signal/async pipe: Too many open files

With fd leak fixed, this should no longer happen. I've tried adding
ev_loop_destroy too - it still fails.

> Note that this only works because the fork isn't done while the default
> loop exists at the time - if you would fork while the default loop
> existed, you'd have to work with ev_default_fork and probably stop all
> watchers you inherited from the parent.

At the time of fork there are no active watchers in the test program.

> Now, the fork business is very unfortunate, but both epoll/kqueue and
> pthreads have diminished fork into a state where using an event loop in
> both parent and child has become extremely hard (actually, doing anything
> in the child is hard with pthreads).

To rule out epoll issues, I'm now using EVBACKEND_SELECT explicitly -
still fails. There are no pthreads either, so it has to be something else.

Attached is modified version with all the fixes mentioned above.
Please try it as well :) I've also tried using EVFLAG_FORKCHECK but
that did not help either.
/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld 
 *
 * Expected output: infinite sequence of *.*.*...
 *
 * Actual output:
 *   denis@denis-laptop:~/work/libev-cvs$ ./testsigchld 
 *   *.*.*.Alarm clock
 *
 *   (number of iterations is different each time)
 *
 * */
#include "ev.c"


struct ev_child child;

void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

void stop_io(struct ev_loop* loop, ev_io *w, int revents) {
    ev_io_stop(loop, w);
}


void subprocess(void) {
    int pid;
    if (pid = fork()) {
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(EV_DEFAULT, &child);
        ev_run(EV_DEFAULT, 0);
        fprintf(stderr, "*");
    }
    else {
        _exit(0);
    }
}


void test_main(void) {
    int pipefd[2];

    while (pipe(pipefd)) perror("pipe");

    int pid = fork();
    if (!pid) {
        ev_loop_fork(EV_DEFAULT);
        close(pipefd[0]);
        subprocess();
        write(pipefd[1], "k", 1);
        _exit(0);
    }

    close(pipefd[1]);
    alarm(5);

    struct ev_io io;
    ev_io_init(&io, stop_io, pipefd[0], 1);
    ev_io_start(EV_DEFAULT, &io);
    ev_run(EV_DEFAULT, 0);
    close(pipefd[0]);

    alarm(0);
}

int main(int argc, char** argv) {
    ev_default_loop(EVBACKEND_SELECT);
    while (1) {
        test_main();
        fprintf(stderr, ".");
    }
}

Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 05:29:20PM +0200, Gabriel Kerneis wrote:
> I do not understand what you mean here, probably because I never tried to mix
> fork, pthreads and epoll in a single program.  Could you please provide some

Oh, it's quite independent of each other, too, i.e. you don't have to use
them together to get the problems.



Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 05:29:20PM +0200, Gabriel Kerneis wrote:
> I do not understand what you mean here, probably because I never tried to mix
> fork, pthreads and epoll in a single program.  Could you please provide some
> more details,

epoll: epoll file descriptors are available in the child, and events are
delivered to both, even for file descriptors that are closed in one of the
processes. since you need an fd to remove fds from an epoll set, you can't
remove that fd (libev uses a heuristic to detect this case and then recreates
the whole epoll set, which can be slow)

kqueue: the kqueue fd is close'd during fork (yes, missing :), or disabled
(yes, fd is still there) and event libraries cannot really distinguish
between the cases. this is not as bad as epoll, but still requires special
fork support from event libraries.

pthreads: after pthread_create, the child environment after fork is
as restricted as in a signal handler, i.e. after fork, you cannot use
malloc/printf or anything else that's unsafe in a signal handler.

the result of this is that it's no fun at all to use fork for
multiprocessing.

(not that fork ever was so good for multiprocessing when doing something
more complicated than a preforked server).
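
As an illustration of the pthreads restriction, a forked child can limit
itself to calls that are safe in a signal handler. This is only a sketch:
run_minimal_child is a hypothetical helper, and the child does nothing except
write() a fixed message and _exit().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that restricts itself to async-signal-safe calls
 * (write and _exit, no malloc or printf) and return its exit status,
 * or -1 on error. */
static int run_minimal_child(int exit_code)
{
    int status;
    pid_t pid;

    assert(exit_code >= 0 && exit_code <= 255);
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        static const char msg[] = "child: alive\n";
        ssize_t r = write(STDERR_FILENO, msg, sizeof msg - 1);

        (void)r;
        _exit(exit_code);     /* _exit, not exit: no atexit handlers, no stdio flush */
    }

    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```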



Re: SIGCHLD not received

2012-05-29 Thread Gabriel Kerneis
Hi Marc,

On Tue, May 29, 2012 at 05:12:22PM +0200, Marc Lehmann wrote:
> Now, the fork business is very unfortunate, but both epoll/kqueue and
> pthreads have diminished fork into a state where using an event loop in
> both parent and child has become extremely hard (actually, doing anything
> in the child is hard with pthreads).

I do not understand what you mean here, probably because I never tried to mix
fork, pthreads and epoll in a single program.  Could you please provide some
more details, or pointers to the relevant documentation?

Many thanks,
-- 
Gabriel Kerneis



Re: SIGCHLD not received

2012-05-29 Thread Marc Lehmann
On Tue, May 29, 2012 at 04:31:05PM +0400, Denis Bilenko wrote:
> I have a weird test case (attached) where SIGCHLD is not being
> received by libev. I don't quite understand if it's a

Well, thanks for a simple-to-try test program :)

> 1) bug in how I use libev

This one - your test program forks after initialising the default loop,
without calling ev_default_fork.

I added an ev_loop_destroy (EV_DEFAULT) at the end of test_main, and got a
lot longer output, until:

(libev) error creating signal/async pipe: Too many open files

Which is probably a different bug in the test program.

Note that this only works because the fork isn't done while the default
loop exists at the time - if you would fork while the default loop
existed, you'd have to work with ev_default_fork and probably stop all
watchers you inherited from the parent.

Now, the fork business is very unfortunate, but both epoll/kqueue and
pthreads have diminished fork into a state where using an event loop in
both parent and child has become extremely hard (actually, doing anything
in the child is hard with pthreads).

If you plan to design a new application, it would probably be much easier in
the long run if you either:

a) only use libev in your "worker" processes.
b) fork+exec worker processes, and use libev freely in both (avoiding
   epoll if you fork often in the parent).
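
Option b) could look roughly like the following. It is a sketch under
assumptions rather than a recommended implementation: run_shell_worker is a
hypothetical helper, /bin/sh is assumed to exist, and the parent simply waits
with waitpid() instead of using a child watcher. Because the child calls
exec immediately, it starts with fresh process state and inherits no event
loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start a worker via fork+exec and wait for it to finish.
 * Returns the worker's exit status, or -1 on error. */
static int run_shell_worker(const char *cmd)
{
    int status;
    pid_t pid;

    assert(cmd != NULL);
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the process image right away. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);           /* only reached if exec failed */
    }

    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```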



SIGCHLD not received

2012-05-29 Thread Denis Bilenko
Hi,

I have a weird test case (attached) where SIGCHLD is not being
received by libev. I don't quite understand if it's a

1) bug in how I use libev
2) bug in libev itself
3) bug in the OS

The workaround I found that makes this test pass is to patch libev to
start a timer (active only when there are active child watchers) which
calls childcb periodically.

I tested it against the latest libev from CVS.

uname -a: Linux denis-laptop 3.2.0-23-generic-pae #36-Ubuntu SMP Tue
Apr 10 22:19:09 UTC 2012 i686 athlon i386 GNU/Linux

Cheers,
Denis.
/* gcc -DEV_STANDALONE=1 testsigchld.c -o testsigchld 
 *
 * Expected output: infinite sequence of *.*.*...
 *
 * Actual output:
 *   denis@denis-laptop:~/work/libev-cvs$ ./testsigchld 
 *   *.*.*.Alarm clock
 *
 *   (number of iterations is different each time)
 *
 * */
#include "ev.c"


struct ev_child child;

void stop_child(struct ev_loop* loop, ev_child *w, int revents) {
    ev_child_stop(loop, w);
}

void stop_io(struct ev_loop* loop, ev_io *w, int revents) {
    ev_io_stop(loop, w);
}


void subprocess(void) {
    int pid;
    ev_default_loop(0);
    if (pid = fork()) {
        ev_child_init(&child, stop_child, pid, 0);
        ev_child_start(ev_default_loop(0), &child);
        ev_run(ev_default_loop(0), 0);
        fprintf(stderr, "*");
    }
    else {
        _exit(0);
    }
}


void test_main(void) {
    int pipefd[2];

    while (pipe(pipefd)) perror("pipe");

    int pid = fork();
    if (!pid) {
        close(pipefd[0]);
        subprocess();
        write(pipefd[1], "k", 1);
        _exit(0);
    }

    close(pipefd[1]);
    alarm(5);

    struct ev_io io;
    ev_io_init(&io, stop_io, pipefd[0], 1);
    ev_io_start(ev_default_loop(0), &io);
    ev_run(ev_default_loop(0), 0);

    alarm(0);
}

int main(int argc, char** argv) {
    while (1) {
        test_main();
        fprintf(stderr, ".");
    }
}