I have the following setup:

        <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
            <analyzer>
                <tokenizer class="solr.StandardTokenizerFactory"/>
                <filter class="solr.LowerCaseFilterFactory"/>
            </analyzer>
        </fieldType>
        <field name="description" type="text" indexed="true" stored="true" multiValued="false" omitNorms="true" />

After indexing my corpus, tf behaves as usual. In this document the term
occurs 14 times in this field:
4.5094776 = (MATCH) weight(description:galaxy^10.0 in 440) [DefaultSimilarity], result of:
      4.5094776 = score(doc=440,freq=14.0 = termFreq=14.0), product of:
        0.14165252 = queryWeight, product of:
          10.0 = boost
          8.5082035 = idf(docFreq=30, maxDocs=56511)
          0.0016648936 = queryNorm
        31.834784 = fieldWeight in 440, product of:
          3.7416575 = tf(freq=14.0), with freq of:
            14.0 = termFreq=14.0
          8.5082035 = idf(docFreq=30, maxDocs=56511)
          1.0 = fieldNorm(doc=440)
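
As a sanity check, the figures in this explain output can be reproduced with plain arithmetic. The sketch below does not call Lucene at all; it assumes the usual DefaultSimilarity formulas (tf as the square root of the frequency, idf as 1 + ln(maxDocs / (docFreq + 1))), and the class name ExplainCheck is made up for illustration.

```java
// Plain-Java sketch, no Lucene dependency. The formulas are assumed to
// match DefaultSimilarity; ExplainCheck is a hypothetical helper name.
public class ExplainCheck {

    // tf = sqrt(freq), which gives 3.7416575 for freq=14
    static double tf(double freq) {
        return Math.sqrt(freq);
    }

    // idf = 1 + ln(maxDocs / (docFreq + 1))
    static double idf(long docFreq, long maxDocs) {
        return 1.0 + Math.log((double) maxDocs / (docFreq + 1));
    }

    // score = queryWeight * fieldWeight, as laid out in the explain tree
    static double score(double boost, double queryNorm, double freq,
                        long docFreq, long maxDocs, double fieldNorm) {
        double idf = idf(docFreq, maxDocs);
        double queryWeight = boost * idf * queryNorm;
        double fieldWeight = tf(freq) * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    public static void main(String[] args) {
        // Values copied from the explain output for doc 440
        System.out.println(score(10.0, 0.0016648936, 14.0, 30, 56511, 1.0));
    }
}
```

The printed value should land close to the 4.5094776 reported for doc 440, allowing for float versus double rounding.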


Then I modify my schema:

    <similarity class="solr.SchemaSimilarityFactory"/>
        <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
            <analyzer>
                <tokenizer class="solr.StandardTokenizerFactory"/>
                <filter class="solr.LowerCaseFilterFactory"/>
            </analyzer>
            <similarity class="com.customsolr.NoTfSimilarityFactory"/>
        </fieldType>

I just want to ignore term frequencies above 1, so that a term is either present or not.

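import org.apache.lucene.search.similarities.DefaultSimilarity;

public class NoTfSimilarity extends DefaultSimilarity {
        // Binary term frequency: any positive count scores as 1.
        @Override
        public float tf(float freq) {
                return freq > 0 ? 1.0f : 0.0f;
        }
}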

But I still see termFreq=14.0 in the explain output for my query:
723.89526 = (MATCH) weight(description:galaxy^10.0 in 440) [], result of:
        723.89526 = score(doc=440,freq=14.0 = termFreq=14.0), product of:
          85.08203 = queryWeight, product of:
            10.0 = boost
            8.5082035 = idf(docFreq=30, maxDocs=56511)
            1.0 = queryNorm
          8.5082035 = fieldWeight in 440, product of:
            1.0 = tf(freq=14.0), with freq of:
              14.0 = termFreq=14.0
            8.5082035 = idf(docFreq=30, maxDocs=56511)
            1.0 = fieldNorm(doc=440)
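
The figures in this second explain output can also be recomputed with plain arithmetic. The sketch below is standalone Java with no Lucene dependency; BinaryTfCheck is a made-up name, and the idf, boost, queryNorm and fieldNorm values are simply copied from the output above. It applies the same present-or-absent rule as my tf override to the raw count of 14.

```java
// Plain-Java sketch, no Lucene dependency. BinaryTfCheck is a hypothetical
// helper; the inputs are copied from the explain output for doc 440.
public class BinaryTfCheck {

    // Same rule as the tf override: any positive frequency counts once.
    static float binaryTf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }

    // score = queryWeight * fieldWeight, as laid out in the explain tree
    static double score(double boost, double idf, double queryNorm,
                        float freq, double fieldNorm) {
        double queryWeight = boost * idf * queryNorm;
        double fieldWeight = binaryTf(freq) * idf * fieldNorm;
        return queryWeight * fieldWeight;
    }

    public static void main(String[] args) {
        System.out.println(binaryTf(14.0f));
        System.out.println(score(10.0, 8.5082035, 1.0, 14.0f, 1.0));
    }
}
```

With these inputs the tf factor is 1.0 and the product lands close to the 723.89526 reported above. The explain line "1.0 = tf(freq=14.0)" shows the factor that is applied, while termFreq=14.0 is the raw count it was computed from.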

Does anyone see what I am missing?
I am on Solr 4.0.

thanks
xavier
