George - wildcard expressions are not analyzed by Lucene/Solr's QueryParser, so a query like Sol* is matched against the index exactly as typed. The Lucene API has a switch for this, setLowercaseExpandedTerms(true), but it isn't yet wired into Solr's configuration. Turning it on would solve the Sol* issue, because all terms in the "text" field are lowercased by the analyzer at index time.
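
To make the mismatch concrete, here is a toy model in plain Java. It is not Lucene or Solr code: the class and method names are made up for illustration, and the boolean flag only mimics what setLowercaseExpandedTerms(true) is described as doing above. Index terms are lowercased, while the wildcard prefix is used as typed unless the flag is set.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical toy model; none of these names exist in Lucene or Solr.
final class WildcardCaseDemo {

    // Stand-in for index-time analysis: whitespace tokenization plus lowercasing.
    static List<String> indexTerms(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .map(t -> t.toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
    }

    // Stand-in for a prefix query such as Sol*: the prefix is NOT analyzed,
    // so it is only lowercased when the flag asks for it.
    static long countPrefixMatches(List<String> terms, String prefix,
                                   boolean lowercaseExpandedTerms) {
        String p = lowercaseExpandedTerms ? prefix.toLowerCase(Locale.ROOT) : prefix;
        return terms.stream().filter(t -> t.startsWith(p)).count();
    }

    public static void main(String[] args) {
        List<String> terms = indexTerms("Solr is built on Lucene");
        System.out.println(countPrefixMatches(terms, "sol", false));
        System.out.println(countPrefixMatches(terms, "Sol", false));
        System.out.println(countPrefixMatches(terms, "Sol", true));
    }
}
```

In this model "sol" finds the indexed term "solr", "Sol" finds nothing, and "Sol" with the flag set finds it again.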

A functional alternative, of course, is to have the client lowercase the query expression before sending it to Solr (careful, though: lowercasing AND/OR/NOT would turn them from operators into plain terms).
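
A minimal sketch of that client-side approach, assuming the query is a simple space-separated expression. The class name QueryLowercaser and its method are hypothetical helpers, not part of Solr or Lucene. It lowercases every token except the uppercase keywords AND, OR, NOT and TO. It also lowercases field names in field:value tokens, so it only fits schemas whose field names are already lowercase.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical client-side helper; not part of Solr or Lucene.
final class QueryLowercaser {

    // Keywords the query syntax recognizes only in upper case.
    private static final Set<String> KEYWORDS = Set.of("AND", "OR", "NOT", "TO");

    // Lowercases each space-separated token unless it is a keyword.
    static String lowercase(String query) {
        List<String> out = new ArrayList<>();
        for (String token : query.split(" ", -1)) {
            out.add(KEYWORDS.contains(token) ? token : token.toLowerCase(Locale.ROOT));
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        System.out.println(lowercase("Sol* AND Luc*"));
    }
}
```

With this, "Sol* AND Luc*" is sent as "sol* AND luc*", so the wildcard terms line up with the lowercased index while the boolean operator keeps its meaning.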

        Erik



On Jul 1, 2008, at 8:14 PM, George Aroush wrote:

Hi Folks,

Can someone tell me what I might have set up wrong? After indexing my data, I can search just fine on, let's say, "sol*", but on "Sol*" (note upper case 'S' vs. lower case 's') I get 0 hits.

Here is my customized schema.xml setting:

   <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
     <analyzer type="index">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <!-- in this example, we will only use synonyms at query time
       <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/> -->
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
       <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <!-- <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/> -->
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
       <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
     </analyzer>
   </fieldType>

Btw, "Solr", "solr", "sOlr", etc. all work. The problem is only with wildcards.

Thanks in advance.

-- George
