[ https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517184 ]
Stanislaw Osinski commented on LUCENE-966:
------------------------------------------

Thanks for more test cases. I guess the biggest problem here is that the scanner generated by JavaCC doesn't seem to strictly follow the specification (see https://issues.apache.org/jira/browse/LUCENE-966#action_12516893), so I'd need to emulate possible JavaCC "bugs" I'm not aware of at the moment (I'm not an expert on lexical scanner generation either, not yet at least :). I can add some workarounds to the grammar to make the known incompatibility examples work, but this won't guarantee consistency in general.

As a side note, it's a shame there's no trace of the version of JavaCC that was used to generate the scanner for the original StandardAnalyzer. I'm also curious if the results of the current JavaCC grammar would be the same with the newest version of the generator (4.0 I guess) -- I'll try to check that.

Anyway, I'll take a look at the problem in more depth once again. And in the worst case scenario, we can keep the StandardAnalyzer as it was and add the new one next to it so that people can have a choice (on the other hand, this might be a problem for the quality tests).

> A faster JFlex-based replacement for StandardAnalyzer
> -----------------------------------------------------
>
>                 Key: LUCENE-966
>                 URL: https://issues.apache.org/jira/browse/LUCENE-966
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Stanislaw Osinski
>             Fix For: 2.3
>
>         Attachments: AnalyzerBenchmark.java, jflex-analyzer-patch.txt,
>                      jflex-analyzer-r560135-patch.txt, jflex-analyzer-r561292-patch.txt,
>                      jflex-analyzer-r561693-compatibility.txt
>
>
> JFlex (http://www.jflex.de/) can be used to generate a faster (up to several
> times) replacement for StandardAnalyzer. Will add a patch and a simple
> benchmark code in a while.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.