Hi,

I have a question about Polish language support in Solr.

There are two options: StempelPolishStemFilterFactory, or
HunspellStemFilterFactory with a Polish dictionary. I've run some tests, but
the results are not satisfactory. StempelPolishStemFilterFactory is very
fast during indexing, but the search quality is not quite what I
expect. HunspellStemFilterFactory, in turn, gives better search results, but
indexing Polish text with it is very slow.

For example, indexing 100k documents with StempelPolishStemFilterFactory
takes only 10 min (about 150 docs/sec), while with HunspellStemFilterFactory
it takes 1 h 20 min, which is only 18-20 docs/sec (server with 8 cores,
24 GB RAM, index on an SSD).
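
As a rough check of these figures, the throughput can be recomputed from the
document count and the wall-clock times above. This is only a sketch: the
function name docs_per_second is illustrative, and the inputs are the rounded
numbers from this message (100k documents, 10 min, 80 min), so the results
come out slightly higher than the rates quoted above.

```python
def docs_per_second(doc_count: int, minutes: float) -> float:
    """Average indexing throughput for a run of the given duration."""
    return doc_count / (minutes * 60)


# Figures taken from the message: 100k documents, 10 min vs. 1 h 20 min.
stempel_rate = docs_per_second(100_000, 10)
hunspell_rate = docs_per_second(100_000, 80)

# How many times slower the Hunspell run is than the Stempel run.
slowdown = stempel_rate / hunspell_rate

print(f"Stempel:  {stempel_rate:.1f} docs/sec")
print(f"Hunspell: {hunspell_rate:.1f} docs/sec")
print(f"Slowdown: {slowdown:.1f}x")
```

With these inputs, Hunspell indexing is about 8 times slower than Stempel.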

Is it possible to speed up indexing with Hunspell? What should I optimize?

Do you have any experience with Hunspell?

I use Solr 4.0.

Best regards,
Agnieszka
