[ https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless resolved LUCENE-966.
---------------------------------------

       Resolution: Fixed
    Lucene Fields: [New, Patch Available]  (was: [New])

OK I committed this! Thank you Stanislaw!

I ran a quick perf test on Wikipedia (first 50K docs only) and found the new StandardTokenizer is ~6X faster -- awesome :)

I made these small additional changes over the final patch before committing:

  * I removed StandardAnalyzer.html "grammar doc" generation from build.xml since it was using jjdoc. Stanislaw, is there something in JFlex that can generate a BNF description of the grammar as HTML?

  * I removed the @author tag from StandardTokenizer.java: we are removing all such tags and instead giving credit in CHANGES.txt.

  * I removed the whitespace-only diffs from common-build.xml & build.xml.

  * I put back the big comment that describes this tokenizer in StandardTokenizer.java.

  * I put standard Apache copyright headers in all sources.

> A faster JFlex-based replacement for StandardAnalyzer
> -----------------------------------------------------
>
>                 Key: LUCENE-966
>                 URL: https://issues.apache.org/jira/browse/LUCENE-966
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Stanislaw Osinski
>             Fix For: 2.3
>
>         Attachments: AnalyzerBenchmark.java, jflex-analyzer-patch.txt,
> jflex-analyzer-r560135-patch.txt, jflex-analyzer-r561292-patch.txt,
> jflex-analyzer-r561693-compatibility.txt,
> jflex-analyzer-r562378-patch-nodup.txt, jflex-analyzer-r562378-patch.txt
>
>
> JFlex (http://www.jflex.de/) can be used to generate a faster (up to several
> times) replacement for StandardAnalyzer. Will add a patch and simple
> benchmark code in a while.