RemoveDuplicatesTokenFilterFactory can not remove the duplicated term
---------------------------------------------------------------------

                 Key: SOLR-2800
                 URL: https://issues.apache.org/jira/browse/SOLR-2800
             Project: Solr
          Issue Type: Bug
          Components: Schema and Analysis
    Affects Versions: 3.4
         Environment: Windows
            Reporter: Han Hui Wen 
             Fix For: 3.5


RemoveDuplicatesTokenFilterFactory does not remove duplicate terms.

The current implementation is at
http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_3_4/solr/core/src/java/org/apache/solr/analysis/RemoveDuplicatesTokenFilter.java?view=markup

@Override
public boolean incrementToken() throws IOException {
  while (input.incrementToken()) {
    final char term[] = termAttribute.buffer();
    final int length = termAttribute.length();
    final int posIncrement = posIncAttribute.getPositionIncrement();

    if (posIncrement > 0) {
      previous.clear();
    }

    boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));

    // clone the term, and add to the set of seen terms.
    char saved[] = new char[length];
    System.arraycopy(term, 0, saved, 0, length);
    previous.add(saved);

    if (!duplicate) {
      return true;
    }
  }
  return false;
}
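
To make it easier to pin down which inputs are affected, the rule the loop above applies can be exercised outside Lucene. The following is only a minimal sketch: the Token class, the filter method, and the sample terms are hypothetical stand-ins for the real attribute API, and a plain HashSet replaces the filter's own set of seen terms.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RemoveDuplicatesSketch {

    /** Hypothetical stand-in for a token: term text plus its position increment. */
    static final class Token {
        final String term;
        final int posIncrement;

        Token(String term, int posIncrement) {
            this.term = term;
            this.posIncrement = posIncrement;
        }
    }

    /** Applies the same per-position rule as the loop quoted above. */
    static List<String> filter(List<Token> input) {
        Set<String> previous = new HashSet<>();
        List<String> output = new ArrayList<>();
        for (Token token : input) {
            // A positive increment starts a new position, so forget earlier terms.
            if (token.posIncrement > 0) {
                previous.clear();
            }
            // Only a term already seen at the same position counts as a duplicate.
            boolean duplicate = token.posIncrement == 0 && previous.contains(token.term);
            previous.add(token.term);
            if (!duplicate) {
                output.add(token.term);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("wifi", 1));
        tokens.add(new Token("wifi", 0)); // same position as the first: dropped
        tokens.add(new Token("wi", 0));   // same position, different term: kept
        tokens.add(new Token("wifi", 1)); // new position: kept
        System.out.println(filter(tokens)); // prints [wifi, wi, wifi]
    }
}
```

Under this rule only terms stacked at the same position (position increment 0) are dropped; the same term appearing again at a later position passes through.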



It should look like the following:
@Override
public boolean incrementToken() throws IOException {
        while (input.incrementToken()) {
                final char term[] = termAttribute.buffer();
                final int length = termAttribute.length();
                final int posIncrement = posIncAttribute.getPositionIncrement();

                if (posIncrement > 0) {
                        previous.clear();
                }

                boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));

                if (duplicate) {
                        return false;
                } else {
                        // clone the term, and add to the set of seen terms.
                        char saved[] = new char[length];
                        System.arraycopy(term, 0, saved, 0, length);
                        previous.add(saved);
                }
        }
        return true;
}


        
