On Feb 2, 2010, at 8:53 PM, Olala wrote:

> Hi all!
>
> I have a problem with Solr, and I hope everybody here can help me :)
>
> I want to search for text without diacritics, but Solr will return both
> text with diacritics and text without.
>
> For example, if I query "solr index", it will return "solr index", "sôlr
> index", "sòlr index", "sólr indèx",...
>
> I tried ASCIIFoldingFilter and ISOLatin1AccentFilterFactory, but the result
> is not correct :(

What's not correct? Can you provide more detail? Is it not indexed correctly? You might look at the Analysis tool under the Solr admin area to see how it is processing your content during indexing and searching.

> My schema config:
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0" generateNumberParts="0" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English"
>             protected="protwords.txt"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="0" generateNumberParts="0" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English"
>             protected="protwords.txt"/>
>   </analyzer>
> </fieldType>

You probably should strip diacritics at query time, too: the query analyzer above has no ASCIIFoldingFilterFactory, so only the indexed terms are folded.

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem using Solr/Lucene: http://www.lucidimagination.com/search
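
For reference, here is a minimal sketch of what stripping diacritics at query time could look like. It assumes the same "text" field type as the quoted schema and adds only one filter to the query analyzer. Placing the filter between WordDelimiterFilterFactory and LowerCaseFilterFactory is an assumption that simply mirrors the order of the index analyzer. It is not a tested configuration for this setup.

```xml
<!-- Sketch only: the query analyzer from the quoted schema, with
     ASCIIFoldingFilterFactory added so query terms are folded the
     same way as indexed terms. The filter position is an assumption
     that mirrors the index analyzer chain. -->
<analyzer type="query">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.WordDelimiterFilterFactory"
          generateWordParts="0" generateNumberParts="0" catenateWords="0"
          catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
  <!-- Added: fold accented characters to their ASCII equivalents -->
  <filter class="solr.ASCIIFoldingFilterFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.SnowballPorterFilterFactory" language="English"
          protected="protwords.txt"/>
</analyzer>
```

With folding on both sides, a query such as "sôlr indèx" and a query such as "solr index" should reduce to the same terms. The Analysis tool mentioned above is a good way to confirm that the index and query chains actually produce matching tokens.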