more performance improvements for snowball
------------------------------------------

                 Key: LUCENE-2201
                 URL: https://issues.apache.org/jira/browse/LUCENE-2201
             Project: Lucene - Java
          Issue Type: Improvement
          Components: contrib/analyzers
            Reporter: Robert Muir
            Priority: Minor
         Attachments: LUCENE-2201.patch

I took a more serious look at Snowball after LUCENE-2194.

This patch greatly improves performance, but note that it makes some minor 
breaking changes to Snowball internals:
* Among.s becomes a char[] instead of a String
* SnowballProgram.current becomes a char[] instead of a StringBuilder
* SnowballProgram.eq_s(int, String) becomes eq_s(int, CharSequence), so that 
eq_v(StringBuilder) doesn't need to create an extra String.
* the same change applies to eq_s_b and eq_v_b
* replace_s(int, int, String) becomes replace_s(int, int, CharSequence), so 
that StringBuilder-based slice and insertion methods don't need to create an 
extra string.
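
To illustrate the eq_s/eq_v change listed above, here is a minimal sketch. It 
is not the code from LUCENE-2201.patch: the class name CharBufferSketch, its 
constructor, and getCursor() are hypothetical, and only the field and method 
names (current, cursor, limit, eq_s, eq_v) follow the issue text and the usual 
Snowball naming. It assumes the word being stemmed is held in a char[] and 
shows how accepting a CharSequence lets eq_v pass its StringBuilder straight 
through without calling toString().

```java
// Hypothetical sketch only; not the actual SnowballProgram from the patch.
class CharBufferSketch {
    private final char[] current; // stands in for SnowballProgram.current
    private int cursor;           // position of the next character to match
    private final int limit;      // end of the valid region in current

    CharBufferSketch(String word) {
        this.current = word.toCharArray();
        this.cursor = 0;
        this.limit = current.length;
    }

    // Compares the first sSize chars of s with the buffer at the cursor.
    // On a match the cursor advances past them; otherwise it is unchanged.
    boolean eq_s(int sSize, CharSequence s) {
        if (limit - cursor < sSize) {
            return false;
        }
        for (int i = 0; i < sSize; i++) {
            if (current[cursor + i] != s.charAt(i)) {
                return false;
            }
        }
        cursor += sSize;
        return true;
    }

    // StringBuilder implements CharSequence, so no temporary String is built.
    boolean eq_v(StringBuilder s) {
        return eq_s(s.length(), s);
    }

    int getCursor() {
        return cursor;
    }
}
```

With a String parameter, eq_v would have to allocate a new String on every 
call just to make the comparison; the CharSequence overload avoids that 
allocation while String callers keep working unchanged.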

All of these "breaks" are, IMHO, only theoretical; the real problem is that 
pretty much everything in the Snowball internals is public or protected.

The performance improvement here depends heavily on the Snowball language in 
use, but it's far more significant than that of LUCENE-2194.

