[ 
https://issues.apache.org/jira/browse/LUCENE-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Rowe resolved LUCENE-6682.
--------------------------------
       Resolution: Fixed
         Assignee: Steve Rowe
    Fix Version/s: Trunk
                   5.3

Committed to trunk and branch_5x.

Thanks for reporting, Piotr!

> StandardTokenizer performance bug: buffer is unnecessarily copied when 
> maxTokenLength doesn't change
> ----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-6682
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6682
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Steve Rowe
>            Assignee: Steve Rowe
>             Fix For: 5.3, Trunk
>
>
> From Piotr Idzikowski on java-user mailing list 
> [http://markmail.org/message/af26kr7fermt2tfh]:
> {quote}
> I am developing my own analyzer based on StandardAnalyzer.
> I realized that tokenizer.setMaxTokenLength is called many times.
> {code:java}
> protected TokenStreamComponents createComponents(final String fieldName, final Reader reader) {
>   final StandardTokenizer src = new StandardTokenizer(getVersion(), reader);
>   src.setMaxTokenLength(maxTokenLength);
>   TokenStream tok = new StandardFilter(getVersion(), src);
>   tok = new LowerCaseFilter(getVersion(), tok);
>   tok = new StopFilter(getVersion(), tok, stopwords);
>   return new TokenStreamComponents(src, tok) {
>     @Override
>     protected void setReader(final Reader reader) throws IOException {
>       src.setMaxTokenLength(StandardAnalyzer.this.maxTokenLength);
>       super.setReader(reader);
>     }
>   };
> }
> {code}
> Does this make sense when the length stays the same? I see that each call
> ends up in this method (in StandardTokenizerImpl):
> {code:java}
> public final void setBufferSize(int numChars) {
>   ZZ_BUFFERSIZE = numChars;
>   char[] newZzBuffer = new char[ZZ_BUFFERSIZE];
>   System.arraycopy(zzBuffer, 0, newZzBuffer, 0, Math.min(zzBuffer.length, ZZ_BUFFERSIZE));
>   zzBuffer = newZzBuffer;
> }
> {code}
> So every call allocates a new array and copies the old content into it,
> even when the size has not changed.
> {quote}
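
For illustration only, the general shape of a remedy for this kind of problem is to skip the allocation and copy when the requested size matches the current one. The sketch below is not the committed patch: GuardedBuffer and its methods are hypothetical names, and it compares against the array length rather than a separate size field such as ZZ_BUFFERSIZE.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a scanner's buffer handling; not Lucene's actual class. */
final class GuardedBuffer {
  private char[] buffer;

  GuardedBuffer(int initialSize) {
    buffer = new char[initialSize];
  }

  /** Resizes the buffer only when the requested size differs from the current one. */
  void setBufferSize(int numChars) {
    if (numChars == buffer.length) {
      return; // size unchanged: skip the allocation and the copy
    }
    // Arrays.copyOf truncates or zero-pads, preserving the leading content.
    buffer = Arrays.copyOf(buffer, numChars);
  }

  /** Exposes the current array so callers can check whether it was replaced. */
  char[] buffer() {
    return buffer;
  }

  public static void main(String[] args) {
    GuardedBuffer b = new GuardedBuffer(255);
    char[] before = b.buffer();
    b.setBufferSize(255); // same size, as on every setReader call in the report
    System.out.println("same array after no-op resize: " + (b.buffer() == before));
    b.setBufferSize(512); // a real change still reallocates
    System.out.println("new array after real resize: " + (b.buffer() != before));
  }
}
```

With a guard like this, an analyzer that calls setMaxTokenLength on every setReader with an unchanged value would pay only for an integer comparison.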



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
