Phillipe,
In my opinion, there is a problem with timeout() in the LWP library. I do
not fully understand it, but in some circumstances timeout() does not do
what you would expect. In my experience, this happens when the other
server responds, but slowly. That is, if you connect within the timeout
period but the server then just trickles data to you very slowly, the
request does not time out; it hangs.
I find this to be a very difficult problem.
In my application, I am doing meta-searching, pulling search results from
a partner website. They have made available a special URL for me, and I
hit them for search results, which I then return to my visitors. This
usually works fine, but when they are slow, my site has serious
problems. What I would like to do is time out and simply not return
results from this partner when they are slow. But I have not really been
able to achieve this.
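For what it is worth, the LWP::UserAgent documentation describes timeout() as an inactivity limit: the request is aborted only if nothing is seen on the connection for that many seconds, so the whole request can take much longer. A workaround that is often suggested is to wrap the request in alarm() so there is a hard limit on the total time. The following is only a rough sketch, not something I can vouch for in production. The helper name fetch_with_deadline and the partner URL in the comments are made up for illustration, and it assumes a Unix-like system where alarm() can interrupt the blocking read (it may not interrupt everything, DNS lookups for example).

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper (not part of LWP): fetch $url with $ua, but give up
# once $deadline whole seconds have passed in total.
# Returns the response object, or nothing (undef) if the deadline was hit.
sub fetch_with_deadline {
    my ($ua, $url, $deadline) = @_;
    my $timed_out = 0;

    my $response = eval {
        # The flag records the timeout even if a die from the handler is
        # caught somewhere inside the request and turned into an error response.
        local $SIG{ALRM} = sub { $timed_out = 1; die "deadline exceeded\n" };
        alarm $deadline;             # hard limit on the whole request
        my $r = $ua->get($url);
        alarm 0;                     # finished in time, cancel the alarm
        $r;
    };
    my $err = $@;
    alarm 0;                         # never leave an alarm pending

    return if $timed_out;            # too slow: caller skips this source
    die $err if $err;                # some other failure: pass it on
    return $response;
}

# Example use (the URL is a placeholder):
#   my $ua  = LWP::UserAgent->new(timeout => 10);
#   my $res = fetch_with_deadline($ua, 'http://partner.example/search?q=perl', 5);
#   if ($res && $res->is_success) { print $res->decoded_content }
#   else                          { warn "partner too slow or failed, skipping\n" }
```

I would still set timeout() as well, since it covers the case where the server goes completely silent, and let the alarm cover the slow trickle.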
I also have the same problem you have, in a separate web crawling
application that needs to download thousands of URLs. It just
hangs, and I can do nothing but kill it. This is not predictable
or repeatable in any way, but I suspect it is the same problem --
timeout() only applies to the _connection_, not to slow data loading.
--Jimbo