Multiple Words in String

Chris Fauerbach Sat, 02 Apr 2011 11:22:10 -0700

Good afternoon everyone!
I am stumped and would love some help. I'm new to Solr/Lucene, but I
have thrown myself into it, so I think I have a solid understanding.
Using the analysis tool in the admin interface, I see these words
stemmed and processed as I would expect, so I'm stuck.


In my index, I have two documents, each with a text field. Here are
example values:

1) microsoft.com
2) micro soft

I want to search for microsoft or "micro soft" and find both documents.
I'm using the dismax handler, the fields are properly listed in the
config, and I can find either record, but never both at the same time.
Here's the schema.xml definition for my text field. Any thoughts on what
I can do to find these together?


    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
                preserveOriginal="1"/>
        <filter class="solr.SynonymFilterFactory"
                synonyms="syn/index_synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
                maxGramSize="15" side="front"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
                maxGramSize="15" side="back"/>
        <filter class="solr.SnowballPorterFilterFactory"
                language="English" protected="protwords.txt"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
                maxGramSize="15" side="front"/>
        <filter class="solr.EdgeNGramFilterFactory" minGramSize="2"
                maxGramSize="15" side="back"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="true"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1" catenateWords="1"
                catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"
                preserveOriginal="1"/>
        <filter class="solr.SnowballPorterFilterFactory"
                language="English" protected="protwords.txt"/>
      </analyzer>
    </fieldType>
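
For context, a dismax setup of the kind mentioned above might look
roughly like the following in solrconfig.xml. This is only an
illustrative sketch, not a copy of the actual config: the handler name
"/search" and the single "text" entry in qf are assumptions made for the
example.

    <!-- Hypothetical handler; the name and field list are placeholders -->
    <requestHandler name="/search" class="solr.SearchHandler">
      <lst name="defaults">
        <!-- Parse the q parameter with the dismax query parser -->
        <str name="defType">dismax</str>
        <!-- Query fields: assumed to include the "text" field defined above -->
        <str name="qf">text</str>
      </lst>
    </requestHandler>

With a handler like this, the two searches would be sent as q=microsoft
and q="micro soft", both against the same qf list.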
