Re: Search without diacritics

Grant Ingersoll Wed, 03 Feb 2010 13:53:27 -0800

On Feb 2, 2010, at 8:53 PM, Olala wrote:

> 
> Hi all!
> 
> I have a problem with Solr, and I hope everybody here can help me :)
> 
> I want to search for text without diacritics, but Solr returns both text
> with diacritics and text without them.
> 
> For example, when I query "solr index", it returns "solr index", "sôlr
> index", "sòlr index", "sólr indèx",...
> 
> I tried ASCIIFoldingFilter and ISOLatin1AccentFilterFactory, but the
> results are not correct :(


What's not correct?  Can you provide more detail?  Is it not indexed correctly?
You might look at the Analysis tool under the Solr admin area to see how it is
processing your content during indexing and searching.

> 
> My schema config:
> 
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>      <analyzer type="index">
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>       
>        <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="0" generateNumberParts="0" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>         <filter class="solr.ASCIIFoldingFilterFactory"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt"/>
>      </analyzer>
>      <analyzer type="query">
>        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
>        <filter class="solr.WordDelimiterFilterFactory"
> generateWordParts="0" generateNumberParts="0" catenateWords="0"
> catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>        <filter class="solr.LowerCaseFilterFactory"/>
>        <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt"/>
>      </analyzer>
>    </fieldType>

Your index analyzer includes ASCIIFoldingFilterFactory, but your query analyzer
does not. You probably should strip diacritics during query time, too.
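
As a rough, untested sketch, assuming the rest of your query chain stays as you
posted it, the query analyzer could mirror the index analyzer by adding the same
ASCIIFoldingFilterFactory in the same position (after WordDelimiterFilterFactory,
before LowerCaseFilterFactory). This would replace the existing
<analyzer type="query"> element inside your "text" fieldType:

```xml
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="0" generateNumberParts="0" catenateWords="0"
          catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
  <!-- Added: fold accented characters to ASCII, as the index analyzer does -->
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.SnowballPorterFilterFactory" language="English"
          protected="protwords.txt"/>
</analyzer>
```

With both sides folding, a query for "sólr" and a query for "solr" should end up
as the same term; the Analysis tool will show whether the index and query
outputs actually line up. Note that SynonymFilterFactory still runs before the
folding step here, so entries in synonyms.txt are matched against the unfolded
query text.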

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: 
http://www.lucidimagination.com/search
