Lewis John McGibbney created NUTCH-2207:
-------------------------------------------
Summary: Remove class duplication and smarten-up scoring-similarity plugin
Key: NUTCH-2207
URL: https://issues.apache.org/jira/browse/NUTCH-2207
Project: Nutch
Issue Type: Improvement
Components: plugin, scoring
Affects Versions: 1.11
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 1.12

Right now it appears that DocumentVector.java is duplicated. There is also no license header on [ScoringFilterModel.java|https://github.com/apache/nutch/blob/trunk/src/plugin/scoring-similarity/src/java/org/apache/nutch/scoring/similarity/ScoringFilterModel.java]. I think I've also spotted a number of places where imports are not being used.

Finally, Javadoc is virtually non-existent for the scoring-similarity plugin. It would help to add some documentation, and it would be very helpful if that documentation cited the [SimilarityScoringFilter wiki page|https://wiki.apache.org/nutch/SimilarityScoringFilter]. We should also revisit the wiki page to ensure that all references are present.

CC [~sujenshah]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)