The word delimiter filter is actually combining "100-001" into "100001". You have BOTH catenateNumbers AND catenateAll, so "100-R8989" should generate THREE tokens: the concatenated numbers, "100"; the concatenated words, "R8989"; and both numbers and words concatenated, "100R8989".
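
If the aim is also to match the value exactly as typed, one option is to keep the unsplit token alongside the catenated ones. A minimal sketch, assuming only the preserveOriginal flag changes and every other attribute stays as in the filter you posted (this is untested against your schema):

```xml
<!-- Sketch: same filter as posted, with preserveOriginal switched on so the
     original token is emitted in addition to the catenated forms. -->
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="0" generateNumberParts="0"
        splitOnCaseChange="0" splitOnNumerics="0" stemEnglishPossessive="0"
        catenateWords="1" catenateNumbers="1" catenateAll="1"
        preserveOriginal="1"/>
```

Since the query analyzer uses the same chain, the change would need to go in both places, and existing documents would need to be reindexed before the new tokens show up.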

-- Jack Krupansky

-----Original Message-----
From: EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions)
Sent: Friday, August 8, 2014 3:27 PM
To: solr-user@lucene.apache.org
Subject: WordDelimiter

Hi, I have a situation where I don't want to split the words. I am using the WordDelimiterFilter, which works well.

For example, if I send 100-001 to the analyzer, it does not split the keyword, but if I send 100-R8989, the WordDelimiterFilter splits it into 100 | R8989. Below are the field analyzer and its filters. The same configuration is used at query time.

Let me know if I am missing something here.

<analyzer type="index">
  <charFilter class="solr.HTMLStripCharFilterFactory"/>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.KStemFilterFactory"/>
  <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" splitOnCaseChange="0" splitOnNumerics="0" stemEnglishPossessive="0" catenateWords="1" catenateNumbers="1" catenateAll="1" preserveOriginal="0"/>
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
</analyzer>