[ https://issues.apache.org/jira/browse/SOLR-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12791940#action_12791940 ]
Uwe Schindler commented on SOLR-1662:
-------------------------------------
+1 Looks good!
> BufferedTokenStream incorrect cloning
> -------------------------------------
>
> Key: SOLR-1662
> URL: https://issues.apache.org/jira/browse/SOLR-1662
> Project: Solr
> Issue Type: Bug
> Components: Schema and Analysis
> Affects Versions: 1.4
> Reporter: Robert Muir
> Assignee: Shalin Shekhar Mangar
> Attachments: SOLR-1662.patch
>
>
> As part of writing tests for SOLR-1657, I rewrote one of the base classes
> (BaseTokenTestCase) to use the new TokenStream API, but also with some
> additional safety.
> {code}
>   public static String tsToString(TokenStream in) throws IOException {
>     StringBuilder out = new StringBuilder();
>     TermAttribute termAtt = (TermAttribute)
>         in.addAttribute(TermAttribute.class);
>     // extra safety to enforce, that the state is not preserved and also
>     // assign bogus values
>     in.clearAttributes();
>     termAtt.setTermBuffer("bogusTerm");
>     while (in.incrementToken()) {
>       if (out.length() > 0)
>         out.append(' ');
>       out.append(termAtt.term());
>       in.clearAttributes();
>       termAtt.setTermBuffer("bogusTerm");
>     }
>     in.close();
>     return out.toString();
>   }
> {code}
> Setting the term text to bogus values helps find bugs in tokenstreams that do
> not clear or clone properly. In this case there is a problem with the
> tokenstream AB_AAB_Stream in TestBufferedTokenStream: it converts A B -> A A
> B but does not clone, so the values get overwritten.
> This can be fixed in two ways:
> * BufferedTokenStream does the cloning
> * subclasses are responsible for the cloning
> The question is which one it should be.
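
The overwrite described above can be reproduced without Lucene or Solr on the
classpath. The following is a minimal sketch, not the actual BufferedTokenStream
code: MutableTerm and CloningSketch are hypothetical names, and the consumer
loop only imitates what tsToString does when it writes "bogusTerm" after each
read. It assumes the A B -> A A B rewrite queues the same reusable object twice
unless a copy is made first.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a reusable token; not a Lucene or Solr class.
final class MutableTerm implements Cloneable {
    private String text = "";

    void set(String text) { this.text = text; }

    String get() { return text; }

    @Override
    public MutableTerm clone() {
        try {
            return (MutableTerm) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: the class is Cloneable
        }
    }
}

// Hypothetical model of a filter that turns "A B" into "A A B".
final class CloningSketch {
    static String run(String[] input, boolean cloneBeforeBuffering) {
        MutableTerm shared = new MutableTerm();          // reused for every input token
        Deque<MutableTerm> pending = new ArrayDeque<>(); // buffered output tokens
        StringBuilder out = new StringBuilder();
        for (String word : input) {
            shared.set(word);
            pending.add(cloneBeforeBuffering ? shared.clone() : shared);
            if ("A".equals(word)) {
                // Emit "A" a second time; without a clone this is the same object.
                pending.add(cloneBeforeBuffering ? shared.clone() : shared);
            }
            while (!pending.isEmpty()) {
                MutableTerm term = pending.poll();
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(term.get());
                term.set("bogusTerm"); // the consumer clobbers state after reading
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] input = {"A", "B"};
        System.out.println(run(input, false)); // prints "A bogusTerm B"
        System.out.println(run(input, true));  // prints "A A B"
    }
}
```

In this model the result is the same wherever the copy is made. The two options
in the issue differ only in who owns the clone() call: the base class before it
buffers a token, or each subclass before it hands a token back.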