Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change notification.

The "NutchTutorial" page has been changed by AndreRicardo.
http://wiki.apache.org/nutch/NutchTutorial?action=diff&rev1=24&rev2=25

--------------------------------------------------

  Typically one starts testing one's configuration by crawling at shallow depths, sharply limiting the number of pages fetched at each level (-topN), and watching the output to check that desired pages are fetched and undesirable pages are not. Once one is confident of the configuration, then an appropriate depth for a full crawl is around 10. The number of pages per level (-topN) for a full crawl can be from tens of thousands to millions, depending on your resources.

- Once crawling has completed, one can skip to the Searching section below.
+ Once crawling has completed, one can skip to the [[NutchTutorial#Searching|Searching section]] below.

  == Step-by-Step or Whole-web Crawling ==

