configuring per-field similarity in Solr 4: the global similarity does not support it
Hello, I have Solr 4 configured with several fields using different similarity classes according to: http://wiki.apache.org/solr/SchemaXml#Similarity However, I get this error message: FieldType 'DFR' is configured with a similarity, but the global similarity does not support it: class org.apache.solr.search.similarities.DefaultSimilarityFactory Excerpt from schema.xml below. What I am trying to do is have any field that doesn't specify a similarity to use the default, but to set up 3 specific fields to use the DFR, IB, and BM25 similarities respectively. I think I'm missing something here. Can someone point me to documentation or examples? Tom Simplified schema.xml excerpt: fieldType name=CJKFullText class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer /fieldType !--###-- !-- relevance rank testing -- fieldType name=DFR class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer similarity class=solr.DFRSimilarityFactory str name=basicModelI(F)/str str name=afterEffectB/str str name=normalizationH2/str /similarity /fieldType fieldType name=IB class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer similarity class=solr.IBSimilarityFactory str name=distributionSPL/str str name=lambdaDF/str str name=normalizationH2/str /similarity /fieldType fieldType name=BM25 class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer similarity class=solr.BM25SimilarityFactory !-- start with the defaults -- float name=k11.2/float float name=b0.75/float /similarity /fieldType ===- Excerpt from actual schema.xml fieldType name=CJKFullText class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ filter class=solr.CJKBigramFilterFactory han=true hiragana=true katakana=false hangul=false / filter class=solr.CommonGramsFilterFactory words=1000common.txt / /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ filter class=solr.CJKBigramFilterFactory han=true hiragana=true katakana=false hangul=false / filter class=solr.CommonGramsQueryFilterFactory words=1000common.txt / /analyzer /fieldType !--###-- !-- relevance rank testing -- fieldType name=DFR class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ filter class=solr.CJKBigramFilterFactory han=true hiragana=true katakana=false hangul=false / filter class=solr.CommonGramsFilterFactory words=1000common.txt / /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ filter class=solr.CJKBigramFilterFactory han=true hiragana=true katakana=false hangul=false / filter class=solr.CommonGramsQueryFilterFactory words=1000common.txt / /analyzer similarity class=solr.DFRSimilarityFactory str name=basicModelI(F)/str str name=afterEffectB/str str name=normalizationH2/str /similarity /fieldType fieldType name=IB class=solr.TextField positionIncrementGap=100
RE: configuring per-field similarity in Solr 4: the global similarity does not support it
Hi Tom, The global similarity must be able to delegate similarity to your per-field setting. Solr has the SchemaSimilarityFactory that can do this. Please replace your global similarity with: similarity class=solr.SchemaSimilarityFactory/ Keep in mind that coord and queryNorm (=1.0f) are not implemented now, so you will get different scores for TF-IDF! Cheers, -Original message- From:Tom Burton-West tburt...@umich.edu Sent: Mon 17-Dec-2012 23:11 To: solr-user@lucene.apache.org Subject: configuring per-field similarity in Solr 4: quot;the global similarity does not support itquot; Hello, I have Solr 4 configured with several fields using different similarity classes according to: http://wiki.apache.org/solr/SchemaXml#Similarity However, I get this error message: FieldType 'DFR' is configured with a similarity, but the global similarity does not support it: class org.apache.solr.search.similarities.DefaultSimilarityFactory Excerpt from schema.xml below. What I am trying to do is have any field that doesn't specify a similarity to use the default, but to set up 3 specific fields to use the DFR, IB, and BM25 similarities respectively. I think I'm missing something here. Can someone point me to documentation or examples? Tom Simplified schema.xml excerpt: fieldType name=CJKFullText class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer /fieldType !--###-- !-- relevance rank testing -- fieldType name=DFR class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer similarity class=solr.DFRSimilarityFactory str name=basicModelI(F)/str str name=afterEffectB/str str name=normalizationH2/str /similarity /fieldType fieldType name=IB class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer similarity class=solr.IBSimilarityFactory str name=distributionSPL/str str name=lambdaDF/str str name=normalizationH2/str /similarity /fieldType fieldType name=BM25 class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ /analyzer similarity class=solr.BM25SimilarityFactory !-- start with the defaults -- float name=k11.2/float float name=b0.75/float /similarity /fieldType ===- Excerpt from actual schema.xml fieldType name=CJKFullText class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ filter class=solr.CJKBigramFilterFactory han=true hiragana=true katakana=false hangul=false / filter class=solr.CommonGramsFilterFactory words=1000common.txt / /analyzer analyzer type=query tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ filter class=solr.CJKBigramFilterFactory han=true hiragana=true katakana=false hangul=false / filter class=solr.CommonGramsQueryFilterFactory words=1000common.txt / /analyzer /fieldType !--###-- !-- relevance rank testing -- fieldType name=DFR class=solr.TextField positionIncrementGap=100 autoGeneratePhraseQueries=false analyzer type=index tokenizer class=solr.ICUTokenizerFactory/ filter class=solr.ICUFoldingFilterFactory/ filter class=solr.CJKBigramFilterFactory han=true