On Wed, 28 Aug 2002, Simon Marlow wrote:

> > I have a program that is suffering a select() failure.  It prints:
> > 
> > 9
> > select: Bad file descriptor
> > Fed: fatal error: select failed
> > 
> > From looking at some RTS sources, the 9 apparently represents errno
> > EBADF (bad file descriptor).  Does that mean that my program is
> > somehow closing one or more file descriptors between when the
> > select() starts and when it finishes?
> 
> This can arise if you do one of the operations threadWaitRead or
> threadWaitWrite on a file descriptor that is closed.  Sigbjorn recently
> modified the select() code so that it wouldn't fail in such an ugly way;
> now it just wakes up all the threads that were doing select() in the
> hope that the one with a bad file descriptor in its hands will discover
> it some other way (the real problem is that select() doesn't tell us
> which file descriptors were bad/closed when it fails).
> 
> Anyway, you should avoid this scenario if at all possible, since the
> current fall-back method of waking up all the threads could be quite
> expensive.

I use `threadWaitRead` in one place and `threadWaitWrite` nowhere.  I
added a check before the use of `threadWaitRead` that I've not closed the
file descriptor.  The check never detects a closed file descriptor, yet I
still get the select() failure.

I'm assuming that the RTS's select() is done on the sets of file
descriptors involved in current `threadWaitRead` and `threadWaitWrite`
calls.  Is that true?  Are there other uses of select() in the RTS?
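
For reference, the behaviour at the POSIX level can be sketched in C.  This
is not the RTS's code, only an illustration under the assumption that the
RTS passes the waiting threads' descriptors to a single select() call.  The
function names are hypothetical.  It shows that select() fails with EBADF
without naming the culprit, and that probing each descriptor with
fcntl(fd, F_GETFD) is one way to find it.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd is open in this process,
   0 if the kernel reports EBADF for it. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical helper: polls fds[0..n-1] for readability with a zero
   timeout.  Returns select()'s result; *err receives errno on failure
   and 0 otherwise. */
int poll_readable(const int *fds, int n, int *err)
{
    fd_set readfds;
    struct timeval timeout = {0, 0};
    int maxfd = -1;

    FD_ZERO(&readfds);
    for (int i = 0; i < n; i++) {
        FD_SET(fds[i], &readfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    int rc = select(maxfd + 1, &readfds, NULL, NULL, &timeout);
    *err = (rc == -1) ? errno : 0;
    return rc;
}

/* Hypothetical helper: returns the index of the first descriptor that
   is not open, or -1 if all of them are open. */
int first_bad_fd(const int *fds, int n)
{
    for (int i = 0; i < n; i++)
        if (!fd_is_open(fds[i]))
            return i;
    return -1;
}
```

A check like fd_is_open() asks the kernel directly, so it would also catch
a descriptor closed by code other than my own bookkeeping.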

> > Unfortunately, select() (and hence the GHC RTS) doesn't identify the
> > bad descriptor(s).  Here's where I suspect my program may be going
> > awry.  The main process creates a pipe.  The process then forks.  The
> > parent closes the pipe's read descriptor immediately.  The child soon
> > goes to read from the pipe, using threadWaitRead followed by fdRead.
> > The child process suffers the select failure shown above.
> 
> So... I take it the child shouldn't really be reading from a closed file
> descriptor?

The file descriptor is the read end of a pipe used to send data from the
parent to the child.  The parent closes it because it will never use it,
but only after the parent forks.  So the child's copy of the file
descriptor should still be open, shouldn't it?
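
The expectation can be checked at the POSIX level with a small C sketch.
It is not my Haskell program and does not involve the RTS; the function
name is hypothetical.  It creates a pipe, forks, closes the read end in the
parent only, and reports whether the child could still read from its own
copy.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the child read exactly msg from its
   copy of the pipe's read end after the parent closed its own copy,
   1 if the child's read failed or mismatched, -1 on a setup failure. */
int parent_close_does_not_affect_child(const char *msg)
{
    int fds[2];
    size_t len = strlen(msg);

    if (len == 0 || len > 255 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: drop the unused write end, then read until end of file. */
        char buf[256];
        size_t got = 0;
        ssize_t n;

        close(fds[1]);
        while (got < sizeof buf &&
               (n = read(fds[0], buf + got, sizeof buf - got)) > 0)
            got += (size_t)n;
        _exit(got == len && memcmp(buf, msg, len) == 0 ? 0 : 1);
    }

    /* Parent: close the read end it will never use, only after the fork. */
    close(fds[0]);
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1 || written != (ssize_t)len)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A result of 0 means the parent's close() left the child's descriptor
usable, which is the behaviour I am relying on.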

> > By the way, why shouldn't such a "fatal error" in the RTS raise an
> > exception that I can catch in my program?  I could then determine at
> > least which thread in which process suffered the error.
> 
> A 'fatal error' in the RTS usually means some kind of internal error
> which it isn't possible to recover from.  Hence the term "fatal" :-)
> I've changed the wording in my copy to make it more clear that these
> errors are internal and normally indicate a bug.

Is recovering from this error really not possible, as opposed to not
likely to be useful?

Dean

_______________________________________________
Glasgow-haskell-bugs mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs
