Kim Gybels <kgyb...@infogroep.be> writes:

> The poll provided in compat/poll.c is not interrupted by receiving
> SIGCHLD. Use a timeout for cleaning up dead children in a timely manner.

I think you identified the problem and diagnosed it correctly, but I
find that the change proposed here introduces a severe layering
violation.  The code is still calling what is called poll(), which
should not have such a broken semantics.

The ideal solution would be to fix the emulation so that it also
properly works for reaping a dead child process, but if that is not
possible, another solution that does not break the API layering
would probably be to introduce our own version of something similar
to poll() that helps various platforms that cannot implement the
real poll() faithfully for whatever reason.  Such an xpoll() API
function we introduce (and implement in compat/poll.c) may take, in
addition to the usual parameters to reall poll(), the value of
live_children we have at this call site.  With that

 - On platforms whose poll() does work correctly for culling dead
   children will just ignore the live_children paramater in its
   implementation of xpoll()

 - On other platforms, it will shorten the timeout depending on the
   need to cull dead children, just like your patch did.

Thanks.


>
> Signed-off-by: Kim Gybels <kgyb...@infogroep.be>
> ---
>  daemon.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/daemon.c b/daemon.c
> index fe833ea7de..6dc95c1b2f 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -1147,6 +1147,7 @@ static int service_loop(struct socketlist *socklist)
>  {
>       struct pollfd *pfd;
>       int i;
> +     int poll_timeout = -1;
>  
>       pfd = xcalloc(socklist->nr, sizeof(struct pollfd));
>  
> @@ -1161,8 +1162,13 @@ static int service_loop(struct socketlist *socklist)
>               int i;
>  
>               check_dead_children();
> -
> -             if (poll(pfd, socklist->nr, -1) < 0) {
> +#ifdef NO_POLL
> +             poll_timeout = live_children ? 100 : -1;
> +#endif
> +             int ret = poll(pfd, socklist->nr, poll_timeout);
> +             if  (ret == 0) {
> +                     continue;
> +             } else if (ret < 0) {
>                       if (errno != EINTR) {
>                               logerror("Poll failed, resuming: %s",
>                                     strerror(errno));

Reply via email to