Thanks for the tip about filtering.
On Fri, Sep 30, 2011 at 11:00, lewis john mcgibbney <[email protected]> wrote:

> What type of filtering is going on in your configuration?
>
> It might be best to readdb incrementally on smaller test fetches to make
> sure you're fetching everything you want to.
>
> On Fri, Sep 30, 2011 at 2:23 PM, Fred Zimmerman <[email protected]> wrote:
>
> > What does this mean? Why is db_unfetched so high?
> >
> > I want to know how I can be confident that the crawler has fetched all the
> > pages in the target site.
> >
> > CrawlDb statistics start: crawl-20110930124111/crawldb
> > Statistics for CrawlDb: crawl-20110930124111/crawldb
> > TOTAL urls: 1237
> > retry 0: 1236
> > retry 1: 1
> > min score: 0.0
> > avg score: 0.005751819
> > max score: 1.0
> > status 1 (db_unfetched): 1040
> > status 2 (db_fetched): 179
> > status 3 (db_gone): 15
> > status 5 (db_redir_perm): 3
> > CrawlDb statistics: done
>
> --
> *Lewis*
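For comparing the numbers between incremental test fetches, a small script along these lines might help. It is only a sketch: it assumes the text comes from `bin/nutch readdb <crawldb> -stats` in the same layout as the output quoted above, and the function name `parse_crawldb_stats` is made up for illustration, not part of Nutch.

```python
import re

# Sample input: the status lines from the stats quoted in this thread.
STATS = """\
TOTAL urls: 1237
status 1 (db_unfetched): 1040
status 2 (db_fetched): 179
status 3 (db_gone): 15
status 5 (db_redir_perm): 3
"""

def parse_crawldb_stats(text):
    """Return (total, {status_name: count}) parsed from stats-style text.

    Hypothetical helper; assumes lines shaped like 'TOTAL urls: N'
    and 'status K (name): N'. Other lines are ignored.
    """
    total = None
    statuses = {}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"TOTAL urls:\s*(\d+)", line)
        if m:
            total = int(m.group(1))
            continue
        m = re.match(r"status \d+ \((\w+)\):\s*(\d+)", line)
        if m:
            statuses[m.group(1)] = int(m.group(2))
    return total, statuses

total, statuses = parse_crawldb_stats(STATS)

# Share of known URLs that have been discovered but not yet fetched.
unfetched_share = statuses["db_unfetched"] / total

for name, count in sorted(statuses.items()):
    print(f"{name}: {count} ({count / total:.1%})")
```

In the stats above the four status counts add up to the 1237 total, and roughly 84% of the known URLs are still db_unfetched. Running this after each small fetch round should show whether that share is shrinking.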

