Hi Ian.

> hi, everyone
>
> I used nutch to crawl about 150 websites, result is quite good.
>
> If I only want to search result in a list of defined domains, how can I
> make it?
>
> Thanks!
>
> Ian

Put the lines below in your nutch-site.xml:

<property>
  <name>db.ignore.external.links</name>
  <value>true</value>
  <description>If true, outlinks leading from a page to external hosts
  will be ignored. This is an effective way to limit the crawl to include
  only initially injected hosts, without creating complex URLFilters.
  </description>
</property>

--
Regards, Dmitry Lihachev
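
For reference, here is a minimal sketch of how a complete nutch-site.xml might look with this property in place. It assumes the file contains no other overrides; the comments and the shortened description text are illustrative and not part of the original message. Properties in this file sit inside a single <configuration> root element.

```xml
<?xml version="1.0"?>
<!-- Sketch of conf/nutch-site.xml; assumes no other site-specific overrides. -->
<configuration>
  <property>
    <name>db.ignore.external.links</name>
    <value>true</value>
    <!-- Shortened, illustrative description text. -->
    <description>Ignore outlinks that lead to external hosts, so the crawl
    stays on the initially injected hosts.</description>
  </property>
</configuration>
```

Since the setting acts on outlinks during the crawl, pages already fetched from other hosts would presumably remain in an existing crawl until it is rebuilt.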
