Cameron VandenBerg created SOLR-15319:
-----------------------------------------

             Summary: ExactStatsCache not always producing Distributed IDF
                 Key: SOLR-15319
                 URL: https://issues.apache.org/jira/browse/SOLR-15319
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Cameron VandenBerg


I want a distributed IDF across all parts of the collection, so I have added this line to my solrconfig.xml:

    <statsCache class="org.apache.solr.search.stats.ExactStatsCache" />

This seems to work about 90% of the time, but if I run the same request over and over again, sometimes I get scores computed with a local IDF for just one part of the collection.

Here is an example request:

    /solr/collection1,collection2/query?q=fulltext:shark&rows=500&fl=id,url,title,score&sort=score+desc

I still get documents from both collection1 and collection2, but sometimes the scores are the same as when I query only collection1. I believe that in that case only the document frequency from collection1 is being used for the term.

It looks like this issue is specifically related to multi-collection requests (i.e., I don't observe it for a request against a single collection). Checking `docCount` in the score "explain" output (with `debug=true`), it looks like multi-collection requests pick one collection or the other (apparently non-deterministically?) when retrieving the distributed `docCount` for the IDF calculation.
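
One way to check the `docCount` fluctuation described above is to repeat the same debug request and tally the values that appear in the explain output. The following is only a rough sketch, not part of the original report. It assumes that the JSON response carries explain strings under `debug.explain`, and that those strings contain a line such as `100.0 = docCount` or `100 = N, total number of documents with field`; the exact wording depends on the Solr version and similarity, so the regular expression may need adjusting. The helper names (`doc_counts`, `fetch_explains`, `tally`) and the base URL in the usage note are hypothetical.

```python
import json
import re
import urllib.parse
import urllib.request
from collections import Counter

# Assumption: the explain text reports the collection-wide document count on a
# line like "100.0 = docCount" or "100 = N, total number of documents ...".
DOC_COUNT_RE = re.compile(r"([0-9]+(?:\.[0-9]+)?)\s*=\s*(?:docCount\b|N,)")


def doc_counts(explain_text):
    """Return the distinct docCount values found in one explain string."""
    return {int(float(m.group(1))) for m in DOC_COUNT_RE.finditer(explain_text)}


def fetch_explains(base_url, collections, query, rows=10):
    """Run one debug query against a live Solr and return {doc id: explain}."""
    params = urllib.parse.urlencode({
        "q": query,
        "rows": rows,
        "fl": "id,score",
        "sort": "score desc",
        "debug": "true",
        "wt": "json",
    })
    url = f"{base_url}/solr/{','.join(collections)}/query?{params}"
    with urllib.request.urlopen(url) as resp:
        body = json.load(resp)
    # Assumption: explain output is a mapping of document id to explain text.
    return body.get("debug", {}).get("explain", {})


def tally(runs):
    """Count how often each combination of docCount values occurs across runs."""
    seen = Counter()
    for explains in runs:
        values = set()
        for text in explains.values():
            values |= doc_counts(text)
        seen[tuple(sorted(values))] += 1
    return seen
```

For example, `tally(fetch_explains("http://localhost:8983", ["collection1", "collection2"], "fulltext:shark") for _ in range(50))` would repeat the request 50 times against a hypothetical local node. If the distributed statistics were applied consistently, the tally would be expected to contain a single key; several different keys would point to the per-request variation described in this report.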