catenateWords should be set to true (catenateWords="1") on the
WordDelimiterFilterFactory in the query analyzer. Same goes for the index
analyzer. Setting preserveOriginal="1" would also work.
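
As a rough sketch of what that change might look like, assuming the rest of the
stock "text" type stays as it is in the Solr 1.4.1 example schema, only the
WordDelimiterFilterFactory line changes. The second variant is the
preserveOriginal alternative; pick one, not both.

```xml
<!-- Option 1: catenateWords="1" also emits the joined token, so a query for
     MaxSize yields Max, Size and MaxSize (lowercased by the next filter). -->
<filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1"
        catenateAll="0" catenateNumbers="0" catenateWords="1"
        generateNumberParts="1" generateWordParts="1"/>

<!-- Option 2 (alternative): keep the original, unsplit token alongside
     the generated parts. -->
<filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1"
        catenateAll="0" catenateNumbers="0" catenateWords="0"
        generateNumberParts="1" generateWordParts="1"
        preserveOriginal="1"/>
```

If the index analyzer is changed as well, documents need to be reindexed
before the new tokens show up in the index.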

> I have a field defined as:
>     <field name="content" type="text" indexed="true" stored="false" termVectors="true" multiValued="true" />
> where "text" is unmodified from the schema.xml example that came with Solr
> 1.4.1.
> 
> I have documents with some compound words indexed, words like Sandstone.
> And in several cases words that are camel case like MaxSize. If I query
> using all lower case, sandstone or maxsize, I get the documents I expect.
> If I query with proper case, i.e. Sandstone or Maxsize, I get the documents
> I expect. However, if the query is camel case, MaxSize or SandStone, it
> doesn't find the documents. In the case of MaxSize it is particularly
> frustrating because that is the actual case of the word that was indexed.
> Is this expected behavior? The query analyzer definition for the "text"
> field type is:
> <analyzer type="query">
>   <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>   <filter class="solr.SynonymFilterFactory" ignoreCase="true" expand="true" synonyms="synonyms.txt"/>
>   <filter class="solr.StopFilterFactory" enablePositionIncrements="true" words="stopwords.txt" ignoreCase="true"/>
>   <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" catenateAll="0" catenateNumbers="0" catenateWords="0" generateNumberParts="1" generateWordParts="1"/>
>   <filter class="solr.LowerCaseFilterFactory"/>
>   <filter language="English" class="solr.SnowballPorterFilterFactory" protected="protwords.txt"/>
> </analyzer>
> 
> Is the order of the filters important? If LowerCaseFilterFactory came
> before WordDelimiterFilterFactory, would that fix this? Would it break
> something else?
> 
> Thanks,
> Ken
