> This problem's been reported before. If your OS says that an fd is
> readable via select(), then the read() should not block.
> 
> As you observe though, the read is blocking so your OS is probably not
> telling the truth when it returns from the select().
> 
> The archives have plenty of discussion on this and the simplest
> solution is to put a large-value alarm() handler in qmail-remote. No
> one as yet seems to be able to narrow down which OSes do this and
> under what circumstances.

Mark,

        Thanks for the reply. I only seem to hit the problem during large
mail-outs. One possibility is that, because of the way qmail works,
there's a significant chance that we are making a large number of
simultaneous connections to some servers.

        It's possible that this is causing a connection to be blackholed
somewhere ... that doesn't explain why select() and read() fail to agree,
though. Perhaps select() thinks the connection is closed, but read() doesn't.

        Setting an alarm is a nasty hack in my opinion, but I have to admit
that it's something I considered. A slightly neater solution might be to use
the SO_KEEPALIVE socket option, assuming it works and there isn't a good
reason not to use it.
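
        For reference, a minimal sketch of what I mean by turning on
SO_KEEPALIVE. The helper names enable_keepalive() and keepalive_enabled()
are made up for illustration and are not part of qmail; only setsockopt()
and getsockopt() are real calls. I haven't tried this inside qmail-remote.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not a qmail function): ask the kernel to send
   TCP keepalive probes on a socket. Returns 0 on success, -1 on
   failure with errno set by setsockopt(). */
int enable_keepalive(int fd)
{
  int on = 1;
  return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}

/* Hypothetical helper: report whether keepalive is currently enabled.
   Returns 1 if on, 0 if off, -1 if getsockopt() fails. */
int keepalive_enabled(int fd)
{
  int val = 0;
  socklen_t len = sizeof val;
  if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) == -1)
    return -1;
  return val != 0;
}
```

The call would go right after the socket is created or connected, e.g.
enable_keepalive(fd). One caveat: the default idle time before the first
probe is usually long (on the order of two hours on Linux), so a dead
connection could still tie up a slot for a good while unless that is tuned.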

        What would be better is finding out why this happens, of course.

        Thanks,

                Richard

P.S. If anyone is keeping track, this is Linux 2.2.19 with
concurrencyremote set to 200.
