This is real, and not just for very short docs. I think the reflection
overhead is pretty expensive.
Here are some stats from the Hamshahri corpus (I have been running TREC
tests on Persian just to make sure everything is OK):

SimpleAnalyzer (has reusableTokenStream):
Total time: 47816 ms
Unique tokens: 441660

PersianAnalyzer (no reuse):
Total time: 53928 ms
Unique tokens: 438286

PersianAnalyzer (with reusableTokenStream):
Total time: 47704 ms
Unique tokens: 438286
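
For anyone not familiar with what "reuse" changes here, below is a rough
sketch of the pattern, not the actual Lucene code. ToyTokenizer and
ToyAnalyzer are hypothetical stand-ins made up for illustration; only the
general shape (keep one stream per thread, reset it onto the next Reader
instead of constructing a new one) is meant to mirror reusableTokenStream.
The sketch does not model the reflection cost itself; it only shows that
construction happens once instead of once per document.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in classes for illustration; this is NOT Lucene's API.
public class ReuseSketch {

    /** Toy tokenizer: emits lowercased runs of letters. */
    static final class ToyTokenizer {
        static int constructed = 0; // how many instances have been built
        private Reader input;

        ToyTokenizer(Reader input) {
            constructed++; // any per-construction cost would be paid here
            this.input = input;
        }

        /** Point the existing instance at new input instead of rebuilding it. */
        void reset(Reader input) {
            this.input = input;
        }

        /** Returns the next token, or null at end of input. */
        String next() throws IOException {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = input.read()) != -1) {
                if (Character.isLetter(c)) {
                    sb.append((char) Character.toLowerCase(c));
                } else if (sb.length() > 0) {
                    return sb.toString();
                }
            }
            return sb.length() > 0 ? sb.toString() : null;
        }
    }

    /** Toy analyzer offering both the non-reusing and the reusing path. */
    static final class ToyAnalyzer {
        // One cached tokenizer per thread, so reuse is safe across threads.
        private final ThreadLocal<ToyTokenizer> previous = new ThreadLocal<>();

        /** No reuse: a fresh tokenizer for every document. */
        ToyTokenizer tokenStream(Reader reader) {
            return new ToyTokenizer(reader);
        }

        /** Reuse: build once per thread, then reset onto each new Reader. */
        ToyTokenizer reusableTokenStream(Reader reader) {
            ToyTokenizer t = previous.get();
            if (t == null) {
                t = new ToyTokenizer(reader);
                previous.set(t);
            } else {
                t.reset(reader);
            }
            return t;
        }
    }

    /** Drains a tokenizer into a list. */
    static List<String> tokens(ToyTokenizer t) {
        List<String> out = new ArrayList<>();
        try {
            for (String tok = t.next(); tok != null; tok = t.next()) {
                out.add(tok);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] docs = {"first doc", "second doc", "third doc"};

        ToyAnalyzer analyzer = new ToyAnalyzer();
        int start = ToyTokenizer.constructed;
        for (String d : docs) {
            tokens(analyzer.tokenStream(new StringReader(d)));
        }
        System.out.println("built without reuse: " + (ToyTokenizer.constructed - start));

        start = ToyTokenizer.constructed;
        for (String d : docs) {
            tokens(analyzer.reusableTokenStream(new StringReader(d)));
        }
        System.out.println("built with reuse: " + (ToyTokenizer.constructed - start));
    }
}
```

Both paths produce the same tokens; the difference is only in how many
tokenizer instances get constructed, which is where any per-instance
setup cost would add up over a large corpus.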

On Mon, Aug 10, 2009 at 10:35 AM, Mark Miller<markrmil...@gmail.com> wrote:
> Discussion on speed of new TokenStream API in Solr.
>
> see:
> http://search.lucidimagination.com/search/document/d0040ebe6addad4b/indexing_slowdown_with_latest_lucene_udpate
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-dev-h...@lucene.apache.org
>
>



-- 
Robert Muir
rcm...@gmail.com
