[ https://issues.apache.org/jira/browse/NUTCH-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14319992#comment-14319992 ]

Julien Nioche commented on NUTCH-1942:
--------------------------------------

See [https://code.google.com/p/crawler-commons/]. Lewis is a committer too. The
CC library is already used within Nutch for handling robots.txt, and other
crawlers use it as well.

> Remove TopLevelDomain 
> ----------------------
>
>                 Key: NUTCH-1942
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1942
>             Project: Nutch
>          Issue Type: Task
>            Reporter: Julien Nioche
>            Priority: Minor
>              Labels: newbie
>             Fix For: 1.11
>
>
> We should leverage the domain-related utilities from crawler-commons instead 
> of duplicating them in the `org.apache.nutch.util.domain` package. For 
> instance we could deprecate TopLevelDomain and call the corresponding class 
> in CC instead. The resources in CC are more up-to-date and it is less code to 
> maintain.
> This would be a good task for someone willing to get to know the Nutch 
> codebase better and impress us all with the extent of his/her skills.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
