My colleague, after some digging, found this in SolrQueryParser
(around line 62):

    setLowercaseExpandedTerms(false);

The default for Lucene is true. Was this intentional, or an oversight?
Perhaps it's not related to my problem, but it seems that it might be.

Thanks in advance!

On 4/26/07, Michael Kimsal <[EMAIL PROTECTED]> wrote:
type:changelog AND ( ( (listing:Fox) or (listing:Fox*) or (listing:*Fox) ) )
and
type:changelog AND ( ( (listing:fox) or (listing:fox*) or (listing:*fox) ) )

Is this to do with the wildcards?

Actually, I've just answered my own question.

type:changelog AND ( ( (listing:fox) ) )
and
type:changelog AND ( ( (listing:Fox) ) )

give the same results. But adding in the or listing:fox* or listing:*fox
is always case-sensitive.

However,
http://wiki.apache.org/lucene-java/LuceneFAQ#head-133cf44dd3dff3680c96c1316a663e881eeac35a
seems to say that wildcard searches are not case-sensitive.

Unless someone can point out a way around this, it seems I'll need to
manually reindex and lower-case everything on the way in, then reformat
my search queries to be lower-case as well.

On 4/26/07, Michael Kimsal <[EMAIL PROTECTED]> wrote:
>
> I was just writing a followup.
>
> I'm using the default text field type
>
> <fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <!-- in this example, we will only use synonyms at query time
>     <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
>     -->
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldtype>
>
> That looks to me like it's got LowerCaseFilterFactory in the query
> analyzer and the index analyzer.
>
> I'm still digging in to this, but are there any other things to look for
> anyone can point me to? (Thanks Erik!)
>
> On 4/26/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> >
> > On Apr 26, 2007, at 5:43 PM, Michael Kimsal wrote:
> > > I've looked through the mailing lists and can't find much of anything
> > > regarding case sensitivity. It seems SOLR is case sensitive by
> > > default - I'm using the default settings with a very basic schema -
> > > just text fields.
> >
> > All depends on the analysis you have set up for the fields. If
> > you're indexing "string"-type fields in the default example schema,
> > there is effectively no analysis so searches must be exact matches
> > case and all.
> >
> > > Is there any way to tell the query parser to be case insensitive
> > > during a query? Or do I have to reindex all my data again with
> > > lowercase values?
> >
> > Terms are indexed in a case-sensitive manner, so if you need case
> > insensitivity you need to lowercase on the way in and on querying.
> >
> > Erik
>
> --
> Michael Kimsal
> http://webdevradio.com

--
Michael Kimsal
http://webdevradio.com
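
For anyone hitting the same thing: since the "text" field already lowercases at index time, one possible client-side workaround is to lowercase only the wildcard terms before the query string is sent to Solr. The sketch below is just that, a rough sketch. The class and method names (WildcardLowercaser, lowercaseWildcardTerms) are made up for illustration and are not part of Solr or Lucene, and the regex assumes simple field:term clauses with no quoting or escaping.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or Lucene.
class WildcardLowercaser {

    // Assumption: clauses look like field:term, where the term contains
    // no whitespace, parentheses, or double quotes.
    private static final Pattern FIELD_TERM =
            Pattern.compile("(\\w+):([^\\s()\"]+)");

    // Lowercases only the terms that contain a wildcard (* or ?).
    // Field names, operators, and non-wildcard terms are left untouched,
    // because the query analyzer already lowercases ordinary terms.
    static String lowercaseWildcardTerms(String query) {
        Matcher m = FIELD_TERM.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String term = m.group(2);
            if (term.indexOf('*') >= 0 || term.indexOf('?') >= 0) {
                term = term.toLowerCase(Locale.ROOT);
            }
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + ":" + term));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String q = "type:changelog AND (listing:Fox OR listing:Fox*)";
        System.out.println(lowercaseWildcardTerms(q));
    }
}
```

This only addresses case. Wildcard terms are not run through the field's analyzer, so a lowercased prefix can still miss tokens that the Porter stemmer or the word delimiter filter rewrote at index time.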