On Sat, Sep 27, 2014 at 9:48 AM, Niko Tyni <nt...@debian.org> wrote:

> tag 721421 patch
> thanks
>
> On Fri, Sep 26, 2014 at 11:12:06PM +0300, Niko Tyni wrote:
>
> > The problem apparently happens when the timeout in the select loop
> > (one second) triggers before execvp() has been called.
> >
> > I can reproduce a similar "race" on my x86_64 machine by inserting a
> > sleep(1) call right before the execvp() call.
> >
> > I still haven't got to the bottom of it, but it looks like the gdb
> > output is lost somewhere with select() timing out (and returning zero)
> > on subsequent calls too even though gdb has happily written to the pipe.
>
> Further investigation with strace shows that the fd_set passed into
> select() becomes empty if execvp() happens after the first select()
> call. I was able to reproduce this with gdb replaced by a trivial program
> that just prints to stdout (which greatly helped debugging.)
>
> So I suppose the execvp() call somehow invalidates the fd set?
>
> I haven't found an explanation for this observed behaviour. The closest
> thing I was able to find was this in the select_tut(2) Linux manual page
> (on Debian sid if that matters):
>
>        11. Since  select()  modifies  its  file descriptor sets, if the
>            call is being used in a loop, then the sets must be
>            reinitialized before each call.
>
> Reinitializing the set in the loop fixes it and seems to be the correct
> thing to do anyway. Patch attached, this makes it work for me on both
> mips and amd64.
>

Right, that is definitely a bug. I haven't used select() in such a long time
that I had overlooked that insanity.
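
For the archive, here is a minimal sketch of the pattern the manual page
describes. It is not the attached patch: the helper name read_with_retries()
and its parameters are made up for illustration, and it assumes a single
readable fd with a one-second timeout per attempt. The point is only that
FD_ZERO()/FD_SET() sit inside the loop, because select() overwrites the set
on return and leaves it empty after a timeout. The timeout struct is reset
as well, since Linux may modify it too.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Waits up to max_tries one-second intervals for fd to become readable,
 * then reads at most len bytes into buf.
 * Returns the number of bytes read, 0 if nothing arrived in time (or on
 * end of file), and -1 on error. */
ssize_t read_with_retries(int fd, char *buf, size_t len, int max_tries)
{
    /* select() can only watch descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);

    for (int i = 0; i < max_tries; i++) {
        fd_set rfds;
        struct timeval tv;

        /* Rebuild the set on every iteration: after a timeout select()
         * returns with the set cleared, so reusing it would watch nothing. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        /* Reset the timeout as well; Linux may update it in place. */
        tv.tv_sec = 1;
        tv.tv_usec = 0;

        int ready = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (ready < 0) {
            if (errno == EINTR)
                continue;   /* interrupted by a signal, try again */
            return -1;
        }
        if (ready == 0)
            continue;       /* timed out, fd not readable yet */

        if (FD_ISSET(fd, &rfds))
            return read(fd, buf, len);
    }
    return 0;
}
```

With the FD_ZERO()/FD_SET() pair hoisted out of the loop, the first timeout
would leave rfds empty and every later select() call would time out as well,
even with data sitting in the pipe, which matches what Niko saw in strace.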

Leon
