Unfortunately, StandardAnalyzer is slow; it is really limited by JavaCC's speed. You cannot shave much more performance out of the grammar, as it is already about as simple as it gets.

JavaCC is slow indeed. We used it for a while for Carrot2, but then (3 years ago :) switched to JFlex, which for roughly the same grammar would sometimes be up to 10x (!) faster. You can have a look at our JFlex specification at:

http://carrot2.svn.sourceforge.net/viewvc/carrot2/trunk/carrot2/components/carrot2-util-tokenizer/src/org/carrot2/util/tokenizer/parser/jflex/JFlexWordBasedParserImpl.jflex?view=markup

This one seems more complex than the StandardAnalyzer's, but it's much faster anyway.

If anyone is interested, I could prepare a JFlex-based Analyzer equivalent (to the extent possible) to the current StandardAnalyzer, which might offer nice indexing and highlighting speed-ups.

Best,
Staszek

--
Stanislaw Osinski, [EMAIL PROTECTED]
http://www.carrot-search.com