RemoveDuplicatesTokenFilter doesn't have expected behaviour
-----------------------------------------------------------
Key: SOLR-1869
URL: https://issues.apache.org/jira/browse/SOLR-1869
Project: Solr
Issue Type: Bug
Components: Schema and Analysis
Reporter: Joe Calderon
Priority: Minor
Attachments: SOLR-1869.patch
The RemoveDuplicatesTokenFilter seems broken: it initializes its map and
attributes at the class level rather than in its constructor.

In addition, I would expect the filter to remove identical terms that have the
same offset positions. Instead, it looks like it removes duplicates based on
position increment, which won't work when the filter is used after something
like the EdgeNGram filter. When I posted this to the mailing list, even Erik
Hatcher seemed to think that's what this filter was supposed to do.

Attaching a patch that has the expected behaviour and initializes the variables
in the constructor.
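
To illustrate the difference between the two behaviours described above, here
is a minimal standalone sketch. It does not use the Lucene/Solr API and is not
the code from the attached patch: the Token class, the DedupSketch class, and
both method names are hypothetical, and the sample token stream in main is made
up to show a case where every token advances the position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DedupSketch {

    /** Hypothetical stand-in for an analysis token; not a Lucene class. */
    static final class Token {
        final String term;
        final int startOffset;
        final int endOffset;
        final int positionIncrement;

        Token(String term, int startOffset, int endOffset, int positionIncrement) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
            this.positionIncrement = positionIncrement;
        }
    }

    /** Drops a token if the same term with the same offsets was already seen. */
    static List<Token> removeDuplicatesByOffset(List<Token> tokens) {
        Set<String> seen = new HashSet<>();
        List<Token> out = new ArrayList<>();
        for (Token t : tokens) {
            // Offsets come first so the key stays unambiguous for any term text.
            String key = t.startOffset + ":" + t.endOffset + ":" + t.term;
            if (seen.add(key)) {
                out.add(t);
            }
        }
        return out;
    }

    /** Drops a token only if the same term was already seen at the same position. */
    static List<Token> removeDuplicatesByPosition(List<Token> tokens) {
        Set<String> seenAtPosition = new HashSet<>();
        List<Token> out = new ArrayList<>();
        for (Token t : tokens) {
            // A positive increment moves to a new position, so start over.
            if (t.positionIncrement > 0) {
                seenAtPosition.clear();
            }
            if (seenAtPosition.add(t.term)) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up stream: "fo" appears twice with identical offsets,
        // but every token has a position increment of 1.
        List<Token> tokens = Arrays.asList(
                new Token("fo", 0, 2, 1),
                new Token("fo", 0, 2, 1),
                new Token("foo", 0, 3, 1));

        System.out.println("by offset:   " + removeDuplicatesByOffset(tokens).size());
        System.out.println("by position: " + removeDuplicatesByPosition(tokens).size());
    }
}
```

With this sample input, the offset-based version keeps two tokens while the
position-based version keeps all three, because no two tokens share a position.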