I'm working on a user-generated tagging feature. Some of the tags could be 
multilingual, mixing languages such as English, Chinese, and Japanese.

I'd like to add auto-complete to help users enter tags, and I want it to 
match in the middle of a tag as well as at the start.

For example, if a user types "guit" I want to suggest:
"guitar"
"electric guitar"
"电动guitar"
"guitar英雄"

And if a user types "吉他" I want to suggest:
"吉他Hero"
"electric吉他"
"古典吉他"


I'm thinking about using:

<fieldType name="autocomplete" class="solr.TextField" 
positionIncrementGap="100">
 <analyzer type="index">
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15" />
 </analyzer>
 <analyzer type="query">
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

Would the above setup do what I want to do?

Also, how would I deal with hyphens? For example, I want an input of either 
"wi-f" or "wif" to match the tag "wi-fi".

Would adding WordDelimiterFilterFactory to both "index" and "query" accomplish 
that?
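
For comparison, the other approach I can think of is to strip hyphens with a 
char filter before the tokenizer, in both analyzers, so that "wi-fi" is 
indexed as "wifi" and both "wi-f" and "wif" become the query term "wif". 
This is only an untested sketch: the field type name "autocomplete_nohyphen" 
is made up for illustration, and I am assuming that 
solr.PatternReplaceCharFilterFactory is applied to the raw input before 
KeywordTokenizerFactory sees it.

<fieldType name="autocomplete_nohyphen" class="solr.TextField" 
positionIncrementGap="100">
 <analyzer type="index">
   <!-- assumption: remove every "-" before tokenizing, so "wi-fi" becomes "wifi" -->
   <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="-" replacement=""/>
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
   <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="15" />
 </analyzer>
 <analyzer type="query">
   <!-- same stripping at query time, so "wi-f" and "wif" both become "wif" -->
   <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="-" replacement=""/>
   <tokenizer class="solr.KeywordTokenizerFactory"/>
   <filter class="solr.LowerCaseFilterFactory"/>
 </analyzer>
</fieldType>

I don't know whether this or WordDelimiterFilterFactory is the better fit 
here.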


Thanks.


