> I've been running qmail on a number of platforms quite happily for a
> while - until now I've had no problems at all. However, I am now
> experiencing a problem with qmail-remote hanging.
> The problem I see is with qmail-remote failing to terminate when a
> connection times out. If left alone, the number of "stuck" processes will
> slowly climb; after about a month I had about 25 such processes. The network
> connections remain in the "ESTABLISHED" state.
>
> Looking at the process list right now, I have one stuck:
>
> # ps -ef | grep qmail-remote
> qmailr 12278 662 0 13:13 ? 00:00:00 qmail-remote xxxxxxxxxx.co.uk xx
> qmailr 19876 662 0 16:09 ? 00:00:00 qmail-remote xxxxxxxxxx.com xxxx
> root 19912 19489 0 16:10 pts/0 00:00:00 grep qmail-remote
>
> # strace -p 12278
> read(3, <unfinished ...>
>
> ... all socket read()s in qmail-remote should be protected by a
> select and therefore should not block as this one is doing now. After
> recompiling with debugging and symbols, I get ...
Exactly.
This problem has been reported before. If your OS says, via select(),
that an fd is readable, then the following read() should not block.
As you observe, though, the read() is blocking, so your OS is probably
not telling the truth when it returns from the select().
The archives have plenty of discussion on this, and the simplest
solution is to set a large-value alarm(), with a handler, in
qmail-remote so that a stuck read() is eventually interrupted. No one
has yet been able to narrow down which OSes do this and under what
circumstances.
Regards.