Check these properties:

<property>
  <name>generate.max.count</name>
  <value>-1</value>
  <description>The maximum number of urls in a single
  fetchlist. -1 if unlimited. The urls are counted according
  to the value of the parameter generate.count.mode.
  </description>
</property>

<property>
  <name>generate.count.mode</name>
  <value>host</value>
  <description>Determines how the URLs are counted for generate.max.count.
  Default value is 'host' but can be 'domain'. Note that we do not count
  per IP in the new version of the Generator.
  </description>
</property>
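
As a rough sketch, an override in conf/nutch-site.xml could look like the following. The value 100 is only an illustrative assumption, so pick whatever limit suits your crawl. Note that, per the description above, this caps the URLs counted per host in a single fetchlist, so a host can still accumulate more pages over several generate/fetch rounds.

```xml
<!-- conf/nutch-site.xml: overrides the defaults shown above -->
<configuration>
  <property>
    <name>generate.max.count</name>
    <!-- 100 is an example value, not a recommendation -->
    <value>100</value>
  </property>
  <property>
    <name>generate.count.mode</name>
    <!-- 'host' is the default; 'domain' is the other documented option -->
    <value>host</value>
  </property>
</configuration>
```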

On Wednesday 11 April 2012 17:05:04 Anders Rask wrote:
> Hi!
> I would like to be able to limit how many pages Nutch crawls from a
> specific site, either by specifying the total number of pages to crawl from
> one site or by specifying a depth of how many links that should be followed
> from the initial seed.
> I've been working with Nutch for some time now but haven't been able to
> figure out how this can be achieved. So my question is: Is there any way to
> configure Nutch for this, and if not are there any plans to implement this
> functionality?
> Best regards,
> --Anders Rask

Markus Jelsma - CTO - Openindex
