Thanks a lot, Walter. I removed most of the filters, and now both searches return the same number of results. The field type now simply looks like this:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
  </analyzer>
</fieldType>

Can I ask you another question? I have Magento + Solr and a requirement to create a Magento admin module where I can add and remove synonyms dynamically. Is this possible? I searched Google, but it seems it is not.

Regards,
Plamen

2013/3/29 Walter Underwood <wun...@wunderwood.org>

> There are several problems with this config.
>
> Indexing uses the phonetic filter, but query does not. This almost
> guarantees that nothing will match. Numbers could match, if the filter
> passes them.
>
> Query time has two stopword filters with different lists. Indexing only
> has one. This isn't fatal, but it is pretty weird. Is letterstops.txt
> trying to do the same thing as the length filter? If so, use the length
> filter in both places. Or not at all. Deleting all single characters is
> a bad idea. You'll never find "Vitamin C".
>
> The same synonyms are used at index and query time, which is unnecessary.
> Only use synonyms at index time unless you really know what you are doing
> and have a special need.
>
> wunder
>
> On Mar 29, 2013, at 9:53 AM, Plamen Mihaylov wrote:
>
> > Guys,
> >
> > This is a commented line where expand is false. I moved the synonym
> > filter after the tokenizer, but the result is the same.
> >
> > Actual configuration:
> >
> > <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory" />
> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
> >     <filter class="solr.LowerCaseFilterFactory" />
> >     <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true" />
> >     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
> >     <filter class="solr.LengthFilterFactory" min="2" max="100" />
> >     <!-- <filter class="solr.SnowballPorterFilterFactory" language="English" /> -->
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory" />
> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
> >     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" />
> >     <filter class="solr.LowerCaseFilterFactory" />
> >     <!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
> >     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
> >     <filter class="solr.StopFilterFactory" ignoreCase="true" words="letterstops.txt" enablePositionIncrements="true" />
> >   </analyzer>
> > </fieldType>
> >
> > 2013/3/29 Walter Underwood <wun...@wunderwood.org>
> >
> >> Also, all the filters need to be after the tokenizer.
> >> There are two synonym filters specified, one before the tokenizer and
> >> one after.
> >>
> >> I'm surprised that works at all. Shouldn't that be a fatal error when
> >> loading the config?
> >>
> >> wunder
> >>
> >> On Mar 29, 2013, at 9:33 AM, Thomas Krämer | ontopica wrote:
> >>
> >>> Hi Plamen
> >>>
> >>> You should set expand to true during
> >>>
> >>> <analyzer type="index">
> >>> ....
> >>> <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
> >>> ignoreCase="true" expand="true"/>
> >>>
> >>> ...
> >>>
> >>> Greetings,
> >>>
> >>> Thomas
> >>>
> >>> On 29.03.2013 17:16, Plamen Mihaylov wrote:
> >>>> Hey guys,
> >>>>
> >>>> I have the following problem: I have a website about sports players
> >>>> and use Solr to index their data. I have defined synonyms like:
> >>>> NY, New York. When I search for New York, 145 results are found, but
> >>>> when I search for NY, 142 results are found. Why is there a
> >>>> difference, and how can I fix this?
> >>>>
> >>>> Configuration snippets:
> >>>>
> >>>> synonyms.txt
> >>>>
> >>>> ...
> >>>> NY, New York
> >>>> ...
> >>>>
> >>>> ------
> >>>> schema.xml
> >>>>
> >>>> ...
> >>>> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
> >>>>   <analyzer type="index">
> >>>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> >>>>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
> >>>>     <!-- we will only use synonyms at query time <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/> -->
> >>>>
> >>>>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
> >>>>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
> >>>>     <filter class="solr.LowerCaseFilterFactory" />
> >>>>     <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true" />
> >>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
> >>>>     <filter class="solr.LengthFilterFactory" min="2" max="100" />
> >>>>     <!-- <filter class="solr.SnowballPorterFilterFactory" language="English" /> -->
> >>>>   </analyzer>
> >>>>   <analyzer type="query">
> >>>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
> >>>>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
> >>>>
> >>>>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
> >>>>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" />
> >>>>     <filter class="solr.LowerCaseFilterFactory" />
> >>>>     <!-- <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/> -->
> >>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
> >>>>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="letterstops.txt" enablePositionIncrements="true" />
> >>>>   </analyzer>
> >>>> </fieldType>
> >>>>
> >>>>
> >>>> Thanks in advance.
> >>>> Plamen
> >>>>
> >>>
> >>>
> >>> --
> >>>
> >>> ontopica GmbH
> >>> Prinz-Albert-Str. 2b
> >>> 53113 Bonn
> >>> Germany
> >>> phone: +49-228-227229-22
> >>> fax: +49-228-227229-77
> >>> web: http://www.ontopica.de
> >>> ontopica GmbH
> >>> Registered office: Bonn
> >>>
> >>> Managing directors: Thomas Krämer, Christoph Okpue
> >>> Commercial register: Amtsgericht Bonn, HRB 17852
> >>>
> >>>
> >>
> >
>

--
Regards,
Plamen Mihaylov