Hi Laurent,

I use the copyField approach: I copy the text fields into fields of a
custom type, "text_exact", that I define in my schema.xml. This allows
searching for "exact matches" anywhere within the text field without
matching tokens injected by stemming, synonyms or other index-time filters.
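
For illustration, a copyField setup along these lines would feed such a
field. This is only a sketch: the field names "title" and "title_exact"
are hypothetical, and the attributes would need adjusting to your own
schema.

    <!-- schema.xml sketch; "title" and "title_exact" are made-up names -->
    <field name="title" type="text" indexed="true" stored="true"/>
    <!-- searched only, so it does not need to be stored -->
    <field name="title_exact" type="text_exact" indexed="true" stored="false"/>
    <!-- copy the raw input of "title" into the exact-match field -->
    <copyField source="title" dest="title_exact"/>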

In my application code, I detect when users are performing an exact-match
search and set up the underlying Solr query to use the "text_exact"
fields by selecting an exact-match request handler (a modified copy of
the standard dismax request handler definition in solrconfig.xml).
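
A minimal sketch of what such a handler might look like in
solrconfig.xml follows. The handler name "dismax_exact" and the field
names in qf are assumptions made for illustration, not the actual
configuration described above.

    <!-- solrconfig.xml sketch; handler and field names are hypothetical -->
    <requestHandler name="dismax_exact" class="solr.SearchHandler">
      <lst name="defaults">
        <!-- use the dismax query parser -->
        <str name="defType">dismax</str>
        <!-- query only the exact-match copies of the text fields -->
        <str name="qf">title_exact body_exact</str>
      </lst>
    </requestHandler>

In Solr releases of this period, a handler registered this way could
typically be selected per request with the qt parameter
(qt=dismax_exact in this sketch); check how your version routes
requests to handlers before relying on that.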

Depending on your needs, you might want to do some minimal analysis on
the field (ignore punctuation, lowercase, ...). Here's the text_exact
field type that I use:

    <fieldtype name="text_exact" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="0" generateNumberParts="0" catenateWords="0"
                catenateNumbers="0" catenateAll="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldtype>

-- Dean

-----Original Message-----
From: Vauthrin, Laurent [mailto:laurent.vauth...@disney.com] 
Sent: 20/03/2009 6:13 AM
To: solr-user@lucene.apache.org
Subject: Exact Match

Hello again,

I believe that this question has been posed before but I just wanted to
make sure I understood my options.  Here's the situation:

We have a few fields that are specified as 'text' and a few fields that
are specified as 'string'.  As far as I understand, 'string' will do
exact matches whereas 'text' will do tokenized/contains matches.
However, we have a need to do exact matches on the 'text' fields as well.

I believe I've seen two approaches for this problem:

1. Use a copyField configuration to copy the 'text' field to a
'string' field, then use the string field when exact matches are
needed.

2. Append something like '_start_' and '_end_' to the field at
index and search time for exact matches.

Are there any solutions to this problem that don't require creating
another field or modifying the data?  (i.e. some sort of query filter?)

Thanks,
Laurent

