Doug Cutting wrote:
Perhaps we could enhance the logic of the loop at Fetcher.java:320.
Currently this exits the fetcher when all threads exceed a
timeout. Instead it could kill any thread that exceeds the
timeout, and restart a new thread to replace it. So instead of
just keeping a count of fetcher threads, we could maintain a table
of all running fetcher threads, each with a lastRequestStart time,
rather than a global lastRequestStart. Then, in this loop, we can
check to see if any thread has exceeded the maximum timeout, and,
if it has, kill it and start a new thread. When no urls remain,
threads will exit and remove themselves from the set of threads,
so the loop can exit as it does now, when there are no more
running fetcher threads. Does this make sense? It would prevent
all sorts of thread hangs, not just in regexes.
+1, sounds like a good solution to this.
+1 a much better solution than my suggestion!
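
For anyone picking this up, here is a rough sketch of how the loop Doug
describes might look. It is not taken from Fetcher.java: the class
FetcherWatchdog, its fields, and the fetchAll method are all made-up names,
and the 50 ms poll interval is arbitrary. Note one assumption in particular:
Java has no safe way to kill a thread (Thread.stop is deprecated), so the
sketch interrupts the stuck thread, drops it from the table, and starts a
replacement. A thread stuck in code that ignores interrupts, such as a
runaway regex, would keep running in the background as a daemon thread.

```java
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

/** Hypothetical sketch only; none of these names exist in Nutch. */
class FetcherWatchdog {
    // Table of running fetcher threads (replaces a plain thread count).
    private final Set<Worker> running = ConcurrentHashMap.newKeySet();
    private final Queue<String> urls;          // must be a thread-safe queue
    private final Consumer<String> fetch;      // stand-in for the real fetch logic
    private final long timeoutMs;
    final AtomicInteger restarts = new AtomicInteger();

    FetcherWatchdog(Queue<String> urls, Consumer<String> fetch, long timeoutMs) {
        this.urls = urls;
        this.fetch = fetch;
        this.timeoutMs = timeoutMs;
    }

    private class Worker extends Thread {
        // Per-thread lastRequestStart, rather than a global one.
        volatile long lastRequestStart = System.currentTimeMillis();
        volatile boolean abandoned = false;

        @Override
        public void run() {
            try {
                String url;
                while (!abandoned && (url = urls.poll()) != null) {
                    lastRequestStart = System.currentTimeMillis();
                    fetch.accept(url);
                }
            } finally {
                running.remove(this); // threads remove themselves on exit
            }
        }
    }

    private void startWorker() {
        Worker w = new Worker();
        w.setDaemon(true);   // an abandoned thread must not block JVM exit
        running.add(w);
        w.start();
    }

    /** Runs until no fetcher threads remain, replacing any that time out. */
    void fetchAll(int threadCount) {
        for (int i = 0; i < threadCount; i++) {
            startWorker();
        }
        while (!running.isEmpty()) {
            long now = System.currentTimeMillis();
            for (Worker w : running) {
                if (now - w.lastRequestStart > timeoutMs) {
                    w.abandoned = true;   // stop it from taking more urls
                    w.interrupt();        // best effort; may be ignored
                    running.remove(w);
                    restarts.incrementAndGet();
                    startWorker();        // replacement thread
                }
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

As in the proposal, the loop still exits when the table is empty, so the
normal shutdown path is unchanged. The url that a timed-out thread was
working on is simply dropped here; whether it should be retried or logged
is a separate decision.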