[ http://issues.apache.org/jira/browse/NUTCH-258?page=comments#action_12422962 ]

Chris A. Mattmann commented on NUTCH-258:
-----------------------------------------

Guys,

This issue slipped off my radar for a bit, but I'll have some free time this week to work on it. If there are no objections, I will implement what Jerome suggested: that is, throwing a RuntimeException instead of setting a flag in the NutchConfiguration, as I believe that it has the same effect.

Thanks,
Chris

> Once Nutch logs a SEVERE log item, Nutch fails forevermore
> ----------------------------------------------------------
>
>          Key: NUTCH-258
>          URL: http://issues.apache.org/jira/browse/NUTCH-258
>      Project: Nutch
>   Issue Type: Bug
>   Components: fetcher
>     Affects Versions: 0.8-dev
>  Environment: All
>     Reporter: Scott Ganyo
>  Assigned To: Chris A. Mattmann
>     Priority: Critical
>  Attachments: dumbfix.patch, NUTCH-258.Mattmann.060906.patch.txt
>
>
> Once a SEVERE log item is written, Nutch shuts down any fetching forevermore.
> This is from the run() method in Fetcher.java:
>     public void run() {
>       synchronized (Fetcher.this) {activeThreads++;} // count threads
>
>       try {
>         UTF8 key = new UTF8();
>         CrawlDatum datum = new CrawlDatum();
>
>         while (true) {
>           if (LogFormatter.hasLoggedSevere())     // something bad happened
>             break;                                // exit
>
> Notice the last 2 lines. This will prevent Nutch from ever Fetching again
> once this is hit as LogFormatter is storing this data as a static.
> (Also note that "LogFormatter.hasLoggedSevere()" is also checked in
> org.apache.nutch.net.URLFilterChecker and will disable this class as well.)
> This must be fixed or Nutch cannot be run as any kind of long-running
> service. Furthermore, I believe it is a poor decision to rely on a logging
> event to determine the state of the application - this could have any number
> of side-effects that would be extremely difficult to track down. (As it has
> already for me.)

-- 
This message is automatically generated by JIRA.
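
For illustration only, here is a minimal Java sketch of the approach proposed in the comment: aborting a fetch run by throwing an unchecked exception rather than consulting static logging state. It is not taken from the Nutch code base or from either attached patch. The names FetchLoopSketch, FatalFetchException, fetch and runOnce are hypothetical, and the "bad:" URL prefix is just a stand-in for whatever condition would have been logged as SEVERE.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Nutch code base.
class FetchLoopSketch {

    /** Hypothetical unchecked exception signalling an unrecoverable fetch error. */
    static class FatalFetchException extends RuntimeException {
        FatalFetchException(String message) {
            super(message);
        }
    }

    /** Stand-in for fetching one URL; a "bad:" prefix simulates a severe failure. */
    static String fetch(String url) {
        if (url.startsWith("bad:")) {
            throw new FatalFetchException("severe error while fetching " + url);
        }
        return "fetched " + url;
    }

    /**
     * One fetch run. A severe error aborts this run by propagating the exception.
     * No static flag is set, so a later run in the same JVM starts clean.
     */
    static List<String> runOnce(List<String> urls) {
        List<String> results = new ArrayList<>();
        for (String url : urls) {
            results.add(fetch(url));
        }
        return results;
    }

    public static void main(String[] args) {
        try {
            runOnce(Arrays.asList("http://a.example/", "bad:url"));
        } catch (FatalFetchException e) {
            System.out.println("run aborted: " + e.getMessage());
        }
        // A second run is unaffected by the earlier failure.
        System.out.println(runOnce(Arrays.asList("http://b.example/")));
    }
}
```

The point of the sketch is that the failure is scoped to the run that hit it: the caller decides whether to retry, whereas a static "has logged severe" flag stays set for the lifetime of the JVM.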