Hi, 

 

I've been testing some CJK tokenizers and managed to get acceptable
results using:

 

<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- normalize width before bigram, as e.g. half-width dakuten combine -->
    <filter class="solr.CJKWidthFilterFactory"/>
    <!-- for any non-CJK -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.CJKBigramFilterFactory"/>
  </analyzer>
</fieldType>

 

The tests were done using Solr 3.5.0 on Tomcat 7.

 

To do some further testing, I installed Solr 3.5.0 using the default Jetty
server.

When I tried to start Solr using the same schema, I got:

 

SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.CJKBigramFilterFactory'

SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.CJKWidthFilterFactory'

 

Should these classes come with v. 3.5.0 by default?

Do I need to install anything or copy any libraries?

 

Thank you all.

Frederico

 
