RemoveDuplicatesTokenFilterFactory can not remove the duplicated term
---------------------------------------------------------------------
Key: SOLR-2800
URL: https://issues.apache.org/jira/browse/SOLR-2800
Project: Solr
Issue Type: Bug
Components: Schema and Analysis
Affects Versions: 3.4
Environment: Windows
Reporter: Han Hui Wen
Fix For: 3.5
RemoveDuplicatesTokenFilterFactory does not remove duplicated terms.
The current incrementToken() implementation in
http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_3_4/solr/core/src/java/org/apache/solr/analysis/RemoveDuplicatesTokenFilter.java?view=markup
is:
  @Override
  public boolean incrementToken() throws IOException {
    while (input.incrementToken()) {
      final char term[] = termAttribute.buffer();
      final int length = termAttribute.length();
      final int posIncrement = posIncAttribute.getPositionIncrement();

      if (posIncrement > 0) {
        previous.clear();
      }

      boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));

      // clone the term, and add to the set of seen terms.
      char saved[] = new char[length];
      System.arraycopy(term, 0, saved, 0, length);
      previous.add(saved);

      if (!duplicate) {
        return true;
      }
    }
    return false;
  }
It should instead look like the following:
  @Override
  public boolean incrementToken() throws IOException {
    while (input.incrementToken()) {
      final char term[] = termAttribute.buffer();
      final int length = termAttribute.length();
      final int posIncrement = posIncAttribute.getPositionIncrement();
      if (posIncrement > 0) {
        previous.clear();
      }
      boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
      if (duplicate) {
        return false;
      } else {
        // clone the term, and add to the set of seen terms.
        char saved[] = new char[length];
        System.arraycopy(term, 0, saved, 0, length);
        previous.add(saved);
      }
    }
    return true;
  }
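
To make the behaviour under discussion easier to check, the duplicate test used by the 3.4 loop can be modelled outside Lucene. This is only a minimal sketch, assuming each token is reduced to its term text and position increment. RemoveDuplicatesSketch, Token and dedup are hypothetical names used for illustration, not Solr or Lucene classes, and a plain HashSet stands in for the filter's set of seen terms.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the filter loop; not part of Solr or Lucene.
class RemoveDuplicatesSketch {

    // A token reduced to the two attributes the loop inspects.
    static final class Token {
        final String term;
        final int posIncrement;

        Token(String term, int posIncrement) {
            this.term = term;
            this.posIncrement = posIncrement;
        }
    }

    // Applies the same check as the 3.4 loop to a list of tokens and
    // returns the terms that would be emitted.
    static List<String> dedup(List<Token> input) {
        Set<String> previous = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (Token t : input) {
            // A positive increment starts a new position: forget seen terms.
            if (t.posIncrement > 0) {
                previous.clear();
            }
            // Only a repeat at the same position counts as a duplicate.
            boolean duplicate = t.posIncrement == 0 && previous.contains(t.term);
            previous.add(t.term);
            if (!duplicate) {
                kept.add(t.term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
            new Token("fast", 1),
            new Token("quick", 0),
            new Token("fast", 0),   // same term, same position
            new Token("fast", 1));  // same term, next position
        System.out.println(dedup(tokens)); // prints [fast, quick, fast]
    }
}
```

In this model only the second "fast" at the same position (increment 0) is dropped. The one at the next position is kept, because the set of seen terms is cleared whenever the position advances.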