I am indexing a database column in Solr. I chose the field type "text" for
this column (this type is defined in the sample schema file that ships with
the Solr example).

When I search for the word "impress" and look at the top 3 results, I get
these 3 documents:

<str name="TEXT">bare desire pronounce villainy draught beasts blockish
impression acquit</str> 
<str name="TEXT">bare impression villainy pronounce beasts desire blockish
draught acquit</str> 
<str name="TEXT">beasts desire villainy pronounce bare acquit impression
draught blockish</str> 

But the TEXT field in these documents doesn't actually contain the word
"impress"; it contains the word "impression".

The database does contain a few rows with the word "impress" itself, but
those rows do not appear in the top 3 results.

So my question is: why did the rows containing the word "impression" get
ranked higher than the rows containing the word "impress" when I searched
for "impress"?

My field type "text" is defined as follows in the schema:

    <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
                ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1" catenateAll="0"
                splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
                ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="0" catenateNumbers="0" catenateAll="0"
                splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
                protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>
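
To check whether the stemming filter (solr.EnglishPorterFilterFactory) has
anything to do with this, one thing I could try is a second field type
without it. This is only a sketch: the name "text_exact" is made up, and I
am assuming that with no stemmer in the chain, "impress" and "impression"
would be indexed as separate terms.

    <!-- hypothetical field type: name is invented for this test -->
    <fieldType name="text_exact" class="solr.TextField"
               positionIncrementGap="100">
      <!-- a single analyzer is used for both index and query time -->
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- no EnglishPorterFilterFactory here, so no stemming -->
      </analyzer>
    </fieldType>

I would then copy the TEXT column into a field of this type, re-index, and
compare the top 3 results for "impress" against the ones above.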



-- 
View this message in context: 
http://old.nabble.com/Confused-by-Solr-Ranking-tp27834227p27834227.html
Sent from the Solr - User mailing list archive at Nabble.com.