Crawldb update to total counts per status
-----------------------------------------
Key: NUTCH-1071
URL: https://issues.apache.org/jira/browse/NUTCH-1071
Project: Nutch
Issue Type: Improvement
Affects Versions: 1.4
Reporter: Julien Nioche
Assignee: Julien Nioche
Priority: Trivial
Fix For: 1.4

The reduce phase of the crawldb update outputs all the entries that will be found in the updated crawldb. We can use the counters to summarise the number of URLs per status, which is a bit like the readdb -stats functionality except that it does not require an additional step.

This is a useful way of monitoring the progress of a crawl using the Hadoop JobTracker UI.
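
To illustrate the idea, here is a minimal sketch of per-status counting at the point where the reduce step emits an entry. It is not the actual patch. CounterSink is a hypothetical stand-in that copies the shape of Hadoop's Reporter.incrCounter(String group, String counter, long amount), so the example runs without a cluster. The group name "CrawlDB status", the class names, and the status strings passed in main are all assumptions chosen for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a Hadoop counter sink; not part of Nutch or Hadoop.
// In a real job the framework's Reporter would play this role.
interface CounterSink {
    void incrCounter(String group, String counter, long amount);
}

// In-memory implementation so the sketch can run locally.
class InMemoryCounters implements CounterSink {
    private final Map<String, Long> counts = new TreeMap<>();

    @Override
    public void incrCounter(String group, String counter, long amount) {
        // Accumulate under a combined "group/counter" key.
        counts.merge(group + "/" + counter, amount, Long::sum);
    }

    long get(String group, String counter) {
        return counts.getOrDefault(group + "/" + counter, 0L);
    }

    Map<String, Long> snapshot() {
        return counts;
    }
}

class StatusCounterSketch {
    // Assumed counter group name, for illustration only.
    static final String GROUP = "CrawlDB status";

    // Simulates the end of the reduce step: every entry written to the
    // updated crawldb also bumps the counter named after its status.
    static void emit(String url, String status, CounterSink counters) {
        // A real reducer would write (url, datum) to its output here.
        counters.incrCounter(GROUP, status, 1L);
    }

    public static void main(String[] args) {
        InMemoryCounters counters = new InMemoryCounters();
        emit("http://a.example/", "db_fetched", counters);
        emit("http://b.example/", "db_fetched", counters);
        emit("http://c.example/", "db_unfetched", counters);

        // Print the totals, one line per status.
        counters.snapshot().forEach((name, total) ->
                System.out.println(name + " = " + total));
    }
}
```

Because the counts are accumulated while the entries are being written, no second pass over the crawldb is needed, which is the difference from readdb -stats noted above.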