[
https://issues.apache.org/jira/browse/SOLR-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13117312#comment-13117312
]
Steven Rowe commented on SOLR-2800:
-----------------------------------
Can you provide a test case? Your version seems more efficient, since
duplicates are not placed in the CharArraySet, but the functionality looks the
same to me as the original. Your test case should demonstrate what is failing
without your changes, and that your changes fix the problem.
Please provide code/tests in the form of a patch - it's much easier to see what
you have changed/added. Also, it's very important when you provide code that
you attach a patch to the issue and click where it says "Grant license to ASF
for inclusion in ASF works (as per the Apache License §5)" - unless you do
this, we can't use your code.
> RemoveDuplicatesTokenFilterFactory can not remove the duplicated term
> ---------------------------------------------------------------------
>
> Key: SOLR-2800
> URL: https://issues.apache.org/jira/browse/SOLR-2800
> Project: Solr
> Issue Type: Bug
> Components: Schema and Analysis
> Affects Versions: 3.4
> Environment: Windows
> Reporter: Han Hui Wen
> Labels: RemoveDuplicatesTokenFilterFactory, Solr
> Fix For: 3.5
>
>
> RemoveDuplicatesTokenFilterFactory does not remove duplicated terms.
> The current code in
> http://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_3_4/solr/core/src/java/org/apache/solr/analysis/RemoveDuplicatesTokenFilter.java?view=markup
> is:
> @Override
> public boolean incrementToken() throws IOException {
>   while (input.incrementToken()) {
>     final char term[] = termAttribute.buffer();
>     final int length = termAttribute.length();
>     final int posIncrement = posIncAttribute.getPositionIncrement();
>
>     if (posIncrement > 0) {
>       previous.clear();
>     }
>
>     boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
>
>     // clone the term, and add to the set of seen terms.
>     char saved[] = new char[length];
>     System.arraycopy(term, 0, saved, 0, length);
>     previous.add(saved);
>
>     if (!duplicate) {
>       return true;
>     }
>   }
>   return false;
> }
> It should be like the following:
> @Override
> public boolean incrementToken() throws IOException {
>   while (input.incrementToken()) {
>     final char term[] = termAttribute.buffer();
>     final int length = termAttribute.length();
>     final int posIncrement = posIncAttribute.getPositionIncrement();
>     if (posIncrement > 0) {
>       previous.clear();
>     }
>     boolean duplicate = (posIncrement == 0 && previous.contains(term, 0, length));
>
>     if (duplicate) {
>       return false;
>     } else {
>       // clone the term, and add to the set of seen terms.
>       char saved[] = new char[length];
>       System.arraycopy(term, 0, saved, 0, length);
>       previous.add(saved);
>     }
>   }
>   return true;
> }
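
To make the comparison between the two loops concrete, here is a minimal,
self-contained sketch in plain Java. It does not use Lucene: Token, Filter,
advance(), drain() and sample() are hypothetical stand-ins invented for
illustration, and a HashSet<String> replaces CharArraySet. It only models the
control flow of the two incrementToken() versions quoted above, so treat it as
a rough aid for reasoning, not as the requested test case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

class RemoveDuplicatesSketch {

  /** Hypothetical stand-in for a token: term text plus position increment. */
  static final class Token {
    final String term;
    final int posIncrement;
    Token(String term, int posIncrement) {
      this.term = term;
      this.posIncrement = posIncrement;
    }
  }

  /** Hypothetical stand-in for a token filter; `current` is the token last read. */
  abstract static class Filter {
    final Iterator<Token> input;
    final Set<String> previous = new HashSet<>();
    Token current;
    Filter(List<Token> tokens) { input = tokens.iterator(); }
    /** Plays the role of input.incrementToken(). */
    boolean advance() {
      if (!input.hasNext()) return false;
      current = input.next();
      return true;
    }
    abstract boolean incrementToken();
  }

  /** Control flow of the loop quoted from the 3.4 branch. */
  static final class Original extends Filter {
    Original(List<Token> tokens) { super(tokens); }
    boolean incrementToken() {
      while (advance()) {
        if (current.posIncrement > 0) previous.clear();
        boolean duplicate = current.posIncrement == 0 && previous.contains(current.term);
        previous.add(current.term);
        if (!duplicate) return true;   // emit this token
      }
      return false;                    // input exhausted
    }
  }

  /** Control flow of the loop proposed in the issue description. */
  static final class Proposed extends Filter {
    Proposed(List<Token> tokens) { super(tokens); }
    boolean incrementToken() {
      while (advance()) {
        if (current.posIncrement > 0) previous.clear();
        boolean duplicate = current.posIncrement == 0 && previous.contains(current.term);
        if (duplicate) return false;   // signals end of stream at the first duplicate
        previous.add(current.term);    // non-duplicates are consumed without returning
      }
      return true;                     // reached only after the input is exhausted
    }
  }

  /** Collects emitted terms; capped at maxCalls because Proposed may never return false. */
  static List<String> drain(Filter f, int maxCalls) {
    List<String> out = new ArrayList<>();
    for (int i = 0; i < maxCalls && f.incrementToken(); i++) {
      out.add(f.current.term);
    }
    return out;
  }

  /** "a" appears twice at the same position (second time with increment 0). */
  static List<Token> sample() {
    return Arrays.asList(
        new Token("a", 1), new Token("b", 0), new Token("a", 0), new Token("c", 1));
  }

  public static void main(String[] args) {
    System.out.println(drain(new Original(sample()), 10));
    System.out.println(drain(new Proposed(sample()), 10));
  }
}
```

Under this simplified model, the 3.4 loop emits [a, b, c] for the sample
input, dropping the second "a" at the same position, while the proposed loop
returns false as soon as it meets the duplicate and emits nothing. A real test
against the Lucene classes would still be needed to confirm how either version
behaves in Solr.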