[ https://issues.apache.org/jira/browse/NUTCH-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julien Nioche resolved NUTCH-2069.
----------------------------------
    Resolution: Fixed

Trunk committed revision 1715386. Thanks everyone for comments and reviews

> Ignore external links based on domain
> -------------------------------------
>
>                 Key: NUTCH-2069
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2069
>             Project: Nutch
>          Issue Type: Improvement
>          Components: fetcher, parser
>    Affects Versions: 1.10
>            Reporter: Julien Nioche
>             Fix For: 1.11
>
>         Attachments: NUTCH-2069.patch, NUTCH-2069.v2.patch
>
>
> We currently have `db.ignore.external.links` which is a nice way of
> restricting the crawl based on the hostname. This adds a new parameter
> 'db.ignore.external.links.domain' to do the same based on the domain.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
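
For readers unfamiliar with the distinction drawn in the issue description, the following is a minimal Java sketch of how a hostname-based check and a domain-based check can disagree about whether a link is external. It is not the committed patch: the class and method names (ExternalLinkSketch, naiveDomain, externalByHost, externalByDomain) are hypothetical, and the domain is approximated as the last two labels of the host, which is an assumption that fails for suffixes such as "co.uk". Nutch's own domain resolution may well differ.

```java
import java.net.URI;
import java.util.Locale;

// Illustrative only: hypothetical names, not code from NUTCH-2069.patch.
class ExternalLinkSketch {

    // Extract the lower-cased host part of an absolute URL.
    static String host(String url) {
        return URI.create(url).getHost().toLowerCase(Locale.ROOT);
    }

    // Naive domain guess: the last two dot-separated labels of the host.
    // Assumption for illustration; a real crawler would consult a
    // public-suffix list so that "example.co.uk" is handled correctly.
    static String naiveDomain(String host) {
        String[] labels = host.split("\\.");
        int n = labels.length;
        return n <= 2 ? host : labels[n - 2] + "." + labels[n - 1];
    }

    // Hostname-based check: any difference in host makes the link external.
    static boolean externalByHost(String fromUrl, String toUrl) {
        return !host(fromUrl).equals(host(toUrl));
    }

    // Domain-based check: only a difference in domain makes the link external.
    static boolean externalByDomain(String fromUrl, String toUrl) {
        return !naiveDomain(host(fromUrl)).equals(naiveDomain(host(toUrl)));
    }

    public static void main(String[] args) {
        String from = "http://www.example.com/index.html";
        String to = "http://blog.example.com/post";
        System.out.println(externalByHost(from, to));   // prints "true"
        System.out.println(externalByDomain(from, to)); // prints "false"
    }
}
```

In this sketch a link from www.example.com to blog.example.com is dropped by the hostname check but kept by the domain check, which is the kind of looser restriction the issue asks for.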