WordDelimiterFilterFactory is a rather specialized and capricious beast, and possibly not the most suitable one for your needs (it's meant for things like matching "iPhone 6" against "iphone6").
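As a rough, untested sketch of an alternative, built from the filters linked below: lowercase everything, drop stop words, and index edge n-grams so that partial words like "roy" can match "royal". The field type name `title_prefix`, the choice of StandardTokenizerFactory, and the gram sizes are assumptions for illustration, and `stopwords.txt` is assumed to contain at least "the" and "a".

```xml
<!-- Hypothetical field type; name, tokenizer and gram sizes are illustrative assumptions. -->
<fieldType name="title_prefix" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Case-insensitive matching: "RoYAl" is indexed as "royal". -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumes stopwords.txt lists "the" and "a", so they are never indexed. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- Index word prefixes: "royal" yields "ro", "roy", "roya", "royal". -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Same stop list at query time; no n-grams here, the query term is matched as typed. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>
```

With a chain like this, a query such as "the roy" should reduce to "roy" and match the prefix indexed for "royal", while "the" on its own should analyze to nothing and so not pull in "the wall" or "the room". It is worth checking the actual token output on the Analysis screen before relying on it.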
Things you may want to look at:

http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/ngram/EdgeNGramFilterFactory.html
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/core/StopFilterFactory.html
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/commongrams/CommonGramsFilterFactory.html
http://www.solr-start.com/javadoc/solr-lucene/org/apache/lucene/analysis/commongrams/CommonGramsQueryFilter.html

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 28 September 2014 22:50, PeterKerk <petervdk...@hotmail.com> wrote:
> I have a site which lists companies.
>
> I'm looking to improve my search, but I want to know which available
> analysers and tokenizers I should use for which scenario, and if it's at all
> possible.
>
> I want users to be able to search on the company title on for example a
> company called "The Royal Garden"
>
> The logic for this search should be as follows, "The Royal Garden", should
> be found on queries:
> "the royal garden"
> "royal garden"
> "the roy"
> "The royal"
> "RoYAl"
> "garden"
>
> So case insensitive, matching on parts of words.
>
> However, a query "the royal" should not return companies like:
> "the wall"
> "the room"
> "the restaurant"
>
> So words like "the", but also "a" should be ignored if these are the only
> match in the searchquery.
>
> I now have this:
>
> <fieldType name="searchtext" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> <field name="title_search" type="searchtext" indexed="true"
>        stored="true"/>
>
> I'm testing on http://localhost:8983/solr/#/bm/analysis but I'm stuck.
>
> Also, I would think my scenario is pretty common and lots of users have
> already configured their Solr search to be flexible and powerful...any good
> search configurations would be welcome!
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Flexible-search-field-analyser-tokenizer-configuration-tp4161624.html
> Sent from the Solr - User mailing list archive at Nabble.com.