Try changing the value of this parameter in nutch-site.xml:

<property>
  <name>db.max.outlinks.per.page</name>
  <value>100</value>
  <description>The maximum number of outlinks that we'll process for a page.
  If this value is nonnegative (>=0), at most db.max.outlinks.per.page
  outlinks will be processed for a page; otherwise, all outlinks will be
  processed.
  </description>
</property>
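
As a minimal sketch, assuming you want every outlink on a page to be processed, the override in conf/nutch-site.xml could look like the fragment below. The value -1 is only an illustrative choice: per the description above, any negative value removes the limit. Whether this brings in the missing link depends on whether that page really has more outlinks than the current limit.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: site-specific overrides of nutch-default.xml -->
<configuration>
  <property>
    <name>db.max.outlinks.per.page</name>
    <!-- Assumption: -1 chosen for illustration; any negative value
         means all outlinks of a page are processed. -->
    <value>-1</value>
  </property>
</configuration>
```

Pages that were already parsed keep the outlinks recorded at that time, so they would need to be fetched and parsed again before the new limit has any visible effect.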


Julien

On 31 January 2012 02:56, mina <[email protected]> wrote:

> I am crawling a site with Nutch 1.4, but Nutch doesn't crawl all the links
> on this site. The language of this site is not English. For example, Nutch
> doesn't crawl this link:
>
>
> http://www.irna.ir/News/30786427/سوء-استفاده-از-نام-كمیته-امداد-برای-جمع-آوری-رای-در-مناطق-محروم/سياسي/
>
> How can I solve this problem? What configuration should I change?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/error-in-crawl-all-link-in-no-English-language-sites-tp3702014p3702014.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
