Julien Nioche created NUTCH-1806:
------------------------------------

             Summary: Delegate processing of URL domains to crawler commons
                 Key: NUTCH-1806
                 URL: https://issues.apache.org/jira/browse/NUTCH-1806
             Project: Nutch
          Issue Type: Improvement
    Affects Versions: 1.8
            Reporter: Julien Nioche

We have code in src/java/org/apache/nutch/util/domain and a resource file, conf/domain-suffixes.xml, to handle URL domains. It is used mostly through URLUtil.getDomainName. The resource file is not necessarily up to date, and since crawler commons provides similar functionality, we should use it instead of maintaining our own resources.
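
To illustrate why the freshness of the suffix resource matters, here is a minimal sketch of suffix-based domain extraction. It is not the actual Nutch code and does not use the crawler commons API: the class DomainNameSketch, its getDomainName method, and the four-entry SUFFIXES set are all hypothetical stand-ins for a full, maintained suffix list.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;

public class DomainNameSketch {

    // Hypothetical, deliberately tiny suffix set; a real list has thousands of entries.
    static final Set<String> SUFFIXES = Set.of("com", "org", "uk", "co.uk");

    /** Returns the longest known suffix plus one label, or the whole host if no suffix matches. */
    static String getDomainName(String host) {
        String lower = host.toLowerCase(Locale.ROOT);
        String[] labels = lower.split("\\.");
        // Try candidate suffixes from longest to shortest, always keeping one label in front.
        for (int i = 1; i < labels.length; i++) {
            String candidate = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(candidate)) {
                return labels[i - 1] + "." + candidate;
            }
        }
        // Unknown suffix: fall back to the full host name.
        return lower;
    }

    public static void main(String[] args) throws Exception {
        String host = new URI("http://lucene.apache.org/nutch").getHost();
        System.out.println(getDomainName(host));
        System.out.println(getDomainName("www.bbc.co.uk"));
        // "museum" is missing from the toy set, so the host is returned unchanged.
        System.out.println(getDomainName("www.example.museum"));
    }
}
```

In this sketch a suffix missing from the set makes the lookup fall back to the full host name, which is the kind of silent degradation an out-of-date resource file can cause and one reason to rely on a list maintained elsewhere.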