[ https://issues.apache.org/jira/browse/LUCENE-966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12517184 ]
Stanislaw Osinski commented on LUCENE-966:
------------------------------------------
Thanks for the additional test cases. I think the biggest problem here is that
the scanner generated by JavaCC doesn't seem to follow the specification strictly
(see https://issues.apache.org/jira/browse/LUCENE-966#action_12516893), so I'd
need to emulate possible JavaCC "bugs" that I'm not aware of at the moment (I'm
not an expert on lexical scanner generation either, not yet at least :). I can
add some workarounds to the JFlex grammar to make the known incompatibility
examples work, but this won't guarantee that the two scanners agree on inputs
we haven't tested.
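
To make the consistency concern concrete, a differential check along the
following lines could be run over a set of test inputs. This is only a sketch
under stated assumptions: the two tokenizers are hypothetical stand-ins
modelled as plain functions from text to a list of token strings, not the
Lucene API or the actual JavaCC and JFlex scanners, and the class and method
names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class TokenizerCompatibilityCheck {

    /** Returns one description per input on which the two tokenizers disagree. */
    public static List<String> findMismatches(
            Function<String, List<String>> reference,
            Function<String, List<String>> candidate,
            List<String> inputs) {
        List<String> mismatches = new ArrayList<>();
        for (String input : inputs) {
            List<String> expected = reference.apply(input);
            List<String> actual = candidate.apply(input);
            if (!expected.equals(actual)) {
                mismatches.add("\"" + input + "\": expected " + expected
                        + " but got " + actual);
            }
        }
        return mismatches;
    }

    // Hypothetical stand-in for the reference scanner: splits on whitespace only.
    public static List<String> whitespaceTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Hypothetical stand-in for the candidate scanner: also splits on hyphens.
    public static List<String> whitespaceAndHyphenTokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(trimmed.split("[\\s-]+"));
    }

    public static void main(String[] args) {
        List<String> inputs = Arrays.asList("foo bar", "wi-fi router");
        List<String> mismatches = findMismatches(
                TokenizerCompatibilityCheck::whitespaceTokens,
                TokenizerCompatibilityCheck::whitespaceAndHyphenTokens,
                inputs);
        // Print every input where the candidate diverges from the reference.
        for (String mismatch : mismatches) {
            System.out.println(mismatch);
        }
    }
}
```

A check like this only confirms agreement on the inputs it is given, which is
exactly the limitation described above: passing on the known examples says
nothing about inputs nobody has tried yet.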

As a side note, it's a shame there's no record of which version of JavaCC was
used to generate the scanner for the original StandardAnalyzer. I'm also
curious whether the current JavaCC grammar would produce the same results with
the newest version of the generator (4.0, I guess) -- I'll try to check that.

Anyway, I'll take another, more in-depth look at the problem. In the worst
case, we can keep StandardAnalyzer as it is and add the new one next to it so
that people have a choice (on the other hand, this might be a problem for the
quality tests).
> A faster JFlex-based replacement for StandardAnalyzer
> -----------------------------------------------------
>
> Key: LUCENE-966
> URL: https://issues.apache.org/jira/browse/LUCENE-966
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis
> Reporter: Stanislaw Osinski
> Fix For: 2.3
>
> Attachments: AnalyzerBenchmark.java, jflex-analyzer-patch.txt,
> jflex-analyzer-r560135-patch.txt, jflex-analyzer-r561292-patch.txt,
> jflex-analyzer-r561693-compatibility.txt
>
>
> JFlex (http://www.jflex.de/) can be used to generate a replacement for
> StandardAnalyzer that is up to several times faster. I will add a patch and
> simple benchmark code shortly.