Thanks for the tip about filtering.


On Fri, Sep 30, 2011 at 11:00, lewis john mcgibbney <
[email protected]> wrote:

> What type of filtering is going on in your configuration?
>
> It might be best to run readdb incrementally on smaller test fetches to make
> sure you're fetching everything you want to.
>
> On Fri, Sep 30, 2011 at 2:23 PM, Fred Zimmerman <[email protected]>
> wrote:
>
> > What does this mean? Why is db_unfetched so high?
> >
> > I want to know how I can be confident that the crawler has fetched all the
> > pages in the target site.
> >
> > CrawlDb statistics start: crawl-20110930124111/crawldb
> > Statistics for CrawlDb: crawl-20110930124111/crawldb
> > TOTAL urls:     1237
> > retry 0:        1236
> > retry 1:        1
> > min score:      0.0
> > avg score:      0.005751819
> > max score:      1.0
> > status 1 (db_unfetched):        1040
> > status 2 (db_fetched):  179
> > status 3 (db_gone):     15
> > status 5 (db_redir_perm):       3
> > CrawlDb statistics: done
> >
>
>
>
> --
> *Lewis*
>
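
For anyone checking coverage the same way: db_unfetched counts URLs that the CrawlDb knows about but that have not been fetched yet. Below is a minimal Python sketch, not part of Nutch, that parses the statistics quoted above and reports what share of known URLs is in each status. The function name `parse_crawldb_stats` and the `STATS` variable are made up for illustration, and the parsing assumes the output has the same shape as the text pasted in this thread.

```python
import re

# Statistics copied from the message quoted above.
STATS = """\
TOTAL urls:     1237
retry 0:        1236
retry 1:        1
status 1 (db_unfetched):        1040
status 2 (db_fetched):  179
status 3 (db_gone):     15
status 5 (db_redir_perm):       3
"""


def parse_crawldb_stats(text):
    """Return (total, {status_name: count}) from readdb-style statistics.

    Hypothetical helper: it only understands the line shapes shown in
    this thread ("TOTAL urls: N" and "status K (name): N").
    """
    total = None
    statuses = {}
    for raw in text.splitlines():
        # Drop email quote markers ("> > ") so pasted text also works.
        line = raw.lstrip("> ").strip()
        m = re.match(r"TOTAL urls:\s+(\d+)", line)
        if m:
            total = int(m.group(1))
            continue
        m = re.match(r"status \d+ \((\w+)\):\s+(\d+)", line)
        if m:
            statuses[m.group(1)] = int(m.group(2))
    return total, statuses


total, statuses = parse_crawldb_stats(STATS)
for name, count in statuses.items():
    print(f"{name}: {count} of {total} ({count / total:.1%})")
```

Running this after each small test fetch, as suggested above, shows whether db_unfetched shrinks from round to round or stays stuck, which would point to filtering or to the crawl limits.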
