I assume you've already read
http://wiki.apache.org/nutch/OptimizingCrawls
and tried different values for:

<property>
  <name>fetcher.server.delay</name>
  <value>5.0</value>
  <description>The number of seconds the fetcher will delay between
   successive requests to the same server.</description>
</property>

<property>
  <name>fetcher.server.min.delay</name>
  <value>0.0</value>
  <description>The minimum number of seconds the fetcher will delay between
  successive requests to the same server. This value is applicable ONLY
  if fetcher.threads.per.host is greater than 1 (i.e. the host blocking
  is turned off).</description>
</property>
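
As a rough sketch of what "trying different values" could look like, overrides normally go in conf/nutch-site.xml rather than in nutch-default.xml. The values below are illustrative assumptions only, not recommendations, and I'm assuming your Nutch version still uses the fetcher.threads.per.host property mentioned in the description above:

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: local overrides of nutch-default.xml.
     All values here are illustrative assumptions; tune them for your sites. -->
<configuration>
  <property>
    <name>fetcher.server.delay</name>
    <!-- assumed value: shorter than the 5.0 s default -->
    <value>1.0</value>
  </property>
  <property>
    <name>fetcher.threads.per.host</name>
    <!-- assumed value: more than 1 turns host blocking off -->
    <value>2</value>
  </property>
  <property>
    <name>fetcher.server.min.delay</name>
    <!-- assumed value: only applies when fetcher.threads.per.host > 1 -->
    <value>0.5</value>
  </property>
</configuration>
```

With only three sites, shorter delays and more threads per host put noticeably more load on each server, so this is probably only reasonable if you run those sites or have permission to crawl them that hard.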


On 25 May 2011 12:52, webdev1977 <[email protected]> wrote:

> Any ideas on how (even if it requires code changes) to speed up the
> mapreduce
> portion for a vertical crawl with a very (three right now) small number of
> sites?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Going-Beyond-the-Prototype-tp2923289p2984011.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
