[ https://issues.apache.org/jira/browse/NUTCH-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018413#comment-15018413 ]
Hudson commented on NUTCH-2069:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #3313 (See [https://builds.apache.org/job/Nutch-trunk/3313/])
NUTCH-2069 (jnioche: [http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1715386])
* trunk/CHANGES.txt
* trunk/conf/nutch-default.xml
* trunk/src/java/org/apache/nutch/fetcher/FetcherThread.java
* trunk/src/java/org/apache/nutch/parse/ParseOutputFormat.java

> Ignore external links based on domain
> -------------------------------------
>
> Key: NUTCH-2069
> URL: https://issues.apache.org/jira/browse/NUTCH-2069
> Project: Nutch
> Issue Type: Improvement
> Components: fetcher, parser
> Affects Versions: 1.10
> Reporter: Julien Nioche
> Fix For: 1.11
>
> Attachments: NUTCH-2069.patch, NUTCH-2069.v2.patch
>
>
> We currently have `db.ignore.external.links` which is a nice way of
> restricting the crawl based on the hostname. This adds a new parameter
> 'db.ignore.external.links.domain' to do the same based on the domain.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)