[ https://issues.apache.org/jira/browse/NUTCH-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319992#comment-14319992 ]
Julien Nioche commented on NUTCH-1942:
--------------------------------------

See [https://code.google.com/p/crawler-commons/]. Lewis is a committer too. The CC library is already used within Nutch for handling robots. It is used by other crawlers as well.

> Remove TopLevelDomain
> ----------------------
>
>                 Key: NUTCH-1942
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1942
>             Project: Nutch
>          Issue Type: Task
>            Reporter: Julien Nioche
>            Priority: Minor
>              Labels: newbie
>             Fix For: 1.11
>
>
> We should leverage the domain-related utilities from crawler-commons instead
> of duplicating them in the `org.apache.nutch.util.domain` package. For
> instance, we could deprecate TopLevelDomain and call the corresponding class
> in CC instead. The resources in CC are more up-to-date and it is less code to
> maintain.
> This would be a good task for someone willing to get to know the Nutch
> codebase better and impress us all with the extent of his/her skills.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)