Richard Braman wrote:
When you get an error while fetching, and you get org.apache.nutch.protocol.RetryLater because the maximum number of retries has been reached, Nutch says it has given up and will retry later. When does that retry occur? How would you make a fetchlist of all URLs that have failed? Is this information maintained somewhere?
Each URL in the crawldb has a retry count: the number of times it has been tried without a conclusive result. When the maximum (db.fetch.retry.max) is reached, the page is considered gone. Until then it will be generated for fetch along with other pages. There is no command that generates a fetchlist of only those pages whose retry count is greater than zero.
Doug
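
[Editor's note: One way to inspect that state is to dump the crawldb with bin/nutch readdb <crawldb> -dump <dir> and grep the output for non-zero retry values. If you want it programmatically, below is a minimal sketch, not a Nutch command. It assumes the usual crawldb layout of SequenceFiles under current/part-*/data and uses the hypothetical class name RetriedUrls; adjust paths for your installation.]

// Scan a Nutch crawldb and print every URL whose retry count is
// greater than zero, i.e. pages tried without a conclusive result.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.nutch.crawl.CrawlDatum;

public class RetriedUrls {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // The crawldb keeps its current state as <Text, CrawlDatum> entries
    // in SequenceFiles under <crawldb>/current/part-*/data (assumed layout).
    for (FileStatus part : fs.globStatus(new Path(args[0], "current/part-*/data"))) {
      SequenceFile.Reader reader = new SequenceFile.Reader(fs, part.getPath(), conf);
      Text url = new Text();
      CrawlDatum datum = new CrawlDatum();
      while (reader.next(url, datum)) {
        // Retry count > 0 means the page failed at least once but is
        // not yet past db.fetch.retry.max.
        if (datum.getRetriesSinceFetch() > 0) {
          System.out.println(url + "\t" + datum.getRetriesSinceFetch());
        }
      }
      reader.close();
    }
  }
}

[Run it with the crawldb directory as the only argument, e.g. "java RetriedUrls crawl/crawldb" with the Nutch and Hadoop jars on the classpath; the output is one URL and its retry count per line, which you could feed into your own fetchlist if needed.]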