Hello,

 

I am trying to index Chinese text in Solr.

My definition is:

<field name="tizh" type="text_zh" multiValued="true" indexed="true"
       stored="true" termVectors="true" termPositions="true" termOffsets="true"/>

    <!-- Simplified chinese -->
    <!-- BRUNO -->
    <fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.HMMChineseTokenizerFactory"/>
        <filter class="solr.CJKWidthFilterFactory"/>
        <filter class="solr.StopFilterFactory"
                words="org/apache/lucene/analysis/cn/smart/stopwords.txt"/>
        <filter class="solr.PorterStemFilterFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

But I get too many irrelevant results.

For example, with the query (phone case):

tizh:(手機殼)

my query is translated to:

tizh:(手 OR 機 OR 殼)

But:

tizh:(手 AND 機 AND 殼)

returns 0 results.

And:

tizh:"手機殼"

also returns 0 results.

Is it possible to improve my fieldType, or must I add something else?
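
One possible direction, as an untested sketch: the example query 手機殼 is written in Traditional characters, while HMMChineseTokenizerFactory targets Simplified Chinese, so a bigram-based chain might behave more predictably. The type name text_zh_bigram below is hypothetical. The factories are standard Solr classes, but whether this chain improves relevance on this index is an assumption.

```xml
<!-- Hypothetical type name; a bigram-based chain, not the original configuration -->
<fieldType name="text_zh_bigram" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- StandardTokenizer emits each CJK ideograph as its own token -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Normalize full-width and half-width character variants -->
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Combine adjacent CJK tokens into overlapping bigrams, e.g. 手機 and 機殼 -->
    <filter class="solr.CJKBigramFilterFactory"/>
  </analyzer>
</fieldType>
```

The tizh field would then reference it with type="text_zh_bigram", and the index would have to be rebuilt after the change.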

 

Thanks,

Bruno