Hi Ian.
> hi, everyone
>
> I used nutch to crawl about 150 websites, result is quite good.
>
> If I only want to search result in a list of defined domains, how can I
> make it?
>
> Thanks!
>
> Ian

Put the lines below in your nutch-site.xml:

<property>
  <name>db.ignore.external.links</name>
  <value>true</value>
  <description>If true, outlinks leading from a page to external hosts
  will be ignored. This is an effective way to limit the crawl to include
  only initially injected hosts, without creating complex URLFilters.
  </description>
</property>
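
In case it helps, here is a rough sketch of where the property sits in the file. It assumes your nutch-site.xml uses the usual Hadoop-style layout with a single <configuration> root element. Any other properties you already have stay inside that same element. As the description says, this setting limits what is crawled, so it should take effect on later crawl rounds rather than on pages already fetched.

```xml
<?xml version="1.0"?>
<!-- Sketch of a nutch-site.xml; assumes the standard <configuration> root. -->
<configuration>

  <!-- Your existing site-specific properties stay here. -->

  <!-- Ignore outlinks that point to hosts other than the page's own host. -->
  <property>
    <name>db.ignore.external.links</name>
    <value>true</value>
  </property>

</configuration>
```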

-- 
Regards,
Dmitry Lihachev
