Hi,

After reading this article ( https://opensourceconnections.com/blog/2011/12/23/indexing-chinese-in-solr/ ), I am beginning to understand what is happening.
Has anyone already tried the Paoding algorithm with a recent Solr version?

Thanks,
Bruno

-----Original Message-----
From: Bruno Mannina [mailto:bmann...@free.fr]
Sent: Sunday, January 10, 2021 17:57
To: solr-user@lucene.apache.org
Subject: [solr8.7] not relevant results for chinese query

Hello,

I am trying to use Chinese with my index. My definition is:

<field name="tizh" type="text_zh" multiValued="true" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true"/>

<!-- Simplified chinese -->
<!-- BRUNO -->
<fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.HMMChineseTokenizerFactory"/>
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="org/apache/lucene/analysis/cn/smart/stopwords.txt"/>
    <filter class="solr.PorterStemFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

However, I get too many irrelevant results. For example, the query (phone case):

tizh:(手機殼)

is translated to:

tizh:(手 OR 機 OR 殼)

But:

tizh:(手 AND 機 AND 殼)

returns 0 results, and:

tizh:"手機殼"

also returns 0 results.

Is it possible to improve my fieldType, or must I add something else?

Thanks,
Bruno
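
The difference between the three query forms in the original message can be illustrated with a toy sketch. It is not Solr's analysis chain: it assumes every character becomes its own token (as the translated query tizh:(手 OR 機 OR 殼) suggests), and the three documents are hypothetical, not taken from the real index. The function names are made up for the illustration.

```python
def unigrams(text):
    """Split text into single-character tokens, skipping whitespace."""
    return [ch for ch in text if not ch.isspace()]

# Hypothetical field values; none of them come from the real index.
docs = {
    1: "手套",      # shares only 手 with the query
    2: "飛機模型",  # shares only 機 with the query
    3: "手机壳",    # Simplified spelling of the query term
}

def match_or(query, text):
    """True if the text contains at least one query character."""
    tokens = unigrams(text)
    return any(tok in tokens for tok in unigrams(query))

def match_and(query, text):
    """True if the text contains every query character."""
    tokens = unigrams(text)
    return all(tok in tokens for tok in unigrams(query))

def match_phrase(query, text):
    """True if the query characters appear adjacent and in order."""
    q, t = unigrams(query), unigrams(text)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))

def search(query, matcher):
    """Return the sorted ids of the documents accepted by the matcher."""
    return sorted(doc_id for doc_id, text in docs.items() if matcher(query, text))

query = "手機殼"
print(search(query, match_or))      # [1, 2, 3]
print(search(query, match_and))     # []
print(search(query, match_phrase))  # []
```

With these made-up documents, the OR form matches anything that shares a single character with the query, while the AND and phrase forms match nothing. That is the same pattern as reported above, but whether the real index behaves this way for the same reason is not confirmed by the thread.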