Hello,
I would really appreciate any input or suggestions on this. Thank you.

On Tue, Nov 24, 2009 at 10:59 PM, Rahul R <rahul.s...@gmail.com> wrote:
> Hello,
> In our application we have a catch-all field (the 'text' field) which is
> configured as the default search field. This field holds a combination of
> numbers, letters, special characters, etc. I have a requirement that the
> WordDelimiterFilterFactory should not act on numbers, especially those
> with decimal points. Accuracy of results with respect to numerical data is
> quite important. So if the text field of a document has data like
> "Bridge-Diode 3.55 Volts", I want to make sure that a search for "355" or
> "35.5" does not retrieve this document. I found that the following setting
> for the WordDelimiterFilterFactory works for me (for the most part):
>
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="0" catenateWords="1" catenateNumbers="0"
> catenateAll="0" splitOnCaseChange="0" splitOnNumerics="0"
> preserveOriginal="1"/>
>
> I am using the same setting for both index and query.
>
> The only remaining problem is data like ".355". With the above setting,
> the analysis JSP shows that the WordDelimiterFilterFactory creates both
> ".355" and "355" as term texts. So a search for ".355" retrieves documents
> containing either ".355" or "355". A search for "355" has the same effect.
> I noticed that when the entry for the WordDelimiterFilterFactory was
> removed completely (both index and query), the problem was resolved. But
> this seems too harsh a measure.
>
> Is there a way to prevent the WordDelimiterFilterFactory from acting on
> numerical data at all?
>
> Regards
> Rahul
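
The overlap described in the quoted message can be illustrated with a small Python sketch. It is only a toy model of the term texts reported from the analysis page, not Solr's actual implementation. The function names `toy_word_delimiter` and `matches` are hypothetical, and the rule that a leading or trailing delimiter is stripped and the remaining single subword emitted is an assumption made to reproduce the reported output.

```python
import re


def toy_word_delimiter(token):
    """Toy model of the reported terms; NOT Solr's algorithm.

    Mimics preserveOriginal="1" with generateNumberParts="0" and
    catenateNumbers="0", as in the quoted filter setting.
    """
    terms = [token]  # preserveOriginal="1": keep the token as typed
    # Assumption: leading/trailing delimiters are stripped, and if what
    # remains is a single subword it is emitted as an extra term.
    stripped = re.sub(r"^[^A-Za-z0-9]+|[^A-Za-z0-9]+$", "", token)
    if stripped and stripped != token and stripped.isalnum():
        terms.append(stripped)
    return terms


def matches(query, indexed_value):
    # Same analysis at index and query time, as in the quoted setup:
    # a document matches when the two term sets share at least one term.
    return bool(set(toy_word_delimiter(query)) & set(toy_word_delimiter(indexed_value)))


print(toy_word_delimiter(".355"))
print(toy_word_delimiter("3.55"))
print(matches("355", ".355"))
print(matches("355", "3.55"))
```

Under this model, ".355" and "355" share the term "355", which is why each query retrieves both kinds of document, while "3.55" stays a single term and is not matched by "355" or "35.5".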