Hi Nutch guys, I can already show the crawldb stats. Now I want to see which URLs are db_gone (meaning they returned a 404 or some other error). How can I list the db_gone URLs?
bin/nutch readdb crawl/crawldb -stats

CrawlDb statistics start: crawl/crawldb
Statistics for CrawlDb: crawl/crawldb
TOTAL urls: 2157
retry 0: 2154
retry 5: 3
min score: 0.0
avg score: 0.018363468
max score: 3.01
status 1 (db_unfetched): 1971
status 2 (db_fetched): 158
status 3 (db_gone): 13
status 4 (db_redir_temp): 1
status 5 (db_redir_perm): 14
CrawlDb statistics: done

thanks,
Mario

--
Mario Schröder | http://www.finanz-checks.de
Office: +49 361 2152062
Phone: +49 34464 62301
Cell: +49 163 27 09 807
http://www.xing.com/go/invite/6035007.9c143c
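
One possible way to list the db_gone URLs, sketched under assumptions rather than confirmed for this setup: dump the CrawlDb as text with `bin/nutch readdb crawl/crawldb -dump <out_dir>` and filter the records by status. The snippet assumes that each record in the text dump starts with a `<url><TAB>Version: N` line followed by a `Status: 3 (db_gone)` line; that layout may differ between Nutch versions. The function name `urls_with_status` and the sample data are hypothetical, made up for illustration.

```python
import re

# Assumed record layout in a text dump (may vary by Nutch version):
#   <url>\tVersion: <n>
#   Status: <code> (<name>)
RECORD_START = re.compile(r"^(\S+)\tVersion: \d+$")
STATUS_LINE = re.compile(r"^Status: \d+ \((\w+)\)$")


def urls_with_status(dump_lines, status_name="db_gone"):
    """Return the URLs whose record carries the given status name."""
    urls = []
    current_url = None
    for raw in dump_lines:
        line = raw.rstrip("\r\n")
        start = RECORD_START.match(line)
        if start:
            # A new record begins; remember its URL until the status line.
            current_url = start.group(1)
            continue
        status = STATUS_LINE.match(line)
        if status and current_url is not None:
            if status.group(1) == status_name:
                urls.append(current_url)
            current_url = None
    return urls


# Hypothetical sample in the assumed dump layout, not real crawl data.
SAMPLE_DUMP = "\n".join([
    "http://example.com/\tVersion: 7",
    "Status: 2 (db_fetched)",
    "Retries since fetch: 0",
    "",
    "http://example.com/missing\tVersion: 7",
    "Status: 3 (db_gone)",
    "Retries since fetch: 0",
    "",
])

if __name__ == "__main__":
    for url in urls_with_status(SAMPLE_DUMP.splitlines()):
        print(url)
```

To run it against a real dump, the lines of the text file written into `<out_dir>` would be passed in place of the sample; the exact file name inside that directory depends on the Hadoop output and is not assumed here.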
