Lewis John McGibbney created NUTCH-3041:
-------------------------------------------

             Summary: Address confusing logging in o.a.n.net.URLExemptionFilters
                 Key: NUTCH-3041
                 URL: https://issues.apache.org/jira/browse/NUTCH-3041
             Project: Nutch
          Issue Type: Task
          Components: net
    Affects Versions: 1.19, 1.20
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
             Fix For: 1.21


URLExemptionFilter implementations are used to allow exemptions to external domain resources by overriding the {{db.ignore.external.links}} configuration setting. This is useful when the crawl is focused on a single domain but resources such as images are hosted on a CDN.

Currently [URLExemptionFilters|https://github.com/apache/nutch/blob/271f92e11c39b7a3583cfcd8d664262cfac59674/src/java/org/apache/nutch/net/URLExemptionFilters.java#L47-L48] provides some confusing INFO-level logging:
{quote}INFO o.a.n.n.URLExemptionFilters [LocalJobRunner Map Task Executor #0] Found 0 extensions at point:'org.apache.nutch.net.URLExemptionFilter'
{quote}
I find this confusing, since the message is logged even though no extensions were found. It would be better to log *only* if a URLExemptionFilter implementation actually exists for a given URL.

I will provide a patch for this.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
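
To make the proposed behaviour concrete, here is a minimal sketch of the guard, not the actual patch. The class {{ExemptionLoggingSketch}} and the method {{startupMessage}} are hypothetical names invented for illustration, and {{java.util.logging}} is used only to keep the example self-contained. The extension point id and message text are copied from the log line quoted above.

```java
import java.util.Optional;
import java.util.logging.Logger;

/** Hypothetical stand-in for illustration; not the Nutch class or the eventual patch. */
final class ExemptionLoggingSketch {

  private static final Logger LOG =
      Logger.getLogger(ExemptionLoggingSketch.class.getName());

  // Extension point id as it appears in the log line quoted in the issue.
  static final String X_POINT_ID = "org.apache.nutch.net.URLExemptionFilter";

  /** Returns the message to log, or empty when no filters were found. */
  static Optional<String> startupMessage(int filterCount) {
    if (filterCount <= 0) {
      return Optional.empty(); // nothing configured: stay silent
    }
    return Optional.of(
        "Found " + filterCount + " extensions at point:'" + X_POINT_ID + "'");
  }

  public static void main(String[] args) {
    // 0 mimics the setup from the issue; 2 mimics a crawl with filters enabled.
    for (int count : new int[] {0, 2}) {
      startupMessage(count).ifPresent(msg -> LOG.info(msg));
    }
  }
}
```

With this guard the "Found 0 extensions" line would no longer appear, while a crawl that does have exemption filters would still report how many were found.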