The reason for the issue you are seeing is the IDF component in the
score. IDF = inverse document frequency.
The document frequency is the number of documents in the index that
contain a given term. The higher the document frequency, the more common
the term and thus the less relevant it is. The document frequency is
inverted to give a higher number for rarer, more relevant terms.
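
As a rough illustration of that inversion, here is a small Python sketch
of the IDF formula used by Lucene's classic TF-IDF similarity,
idf = 1 + ln(numDocs / (docFreq + 1)). The function name and the example
numbers are made up for illustration; this is not Solr code.

```python
import math


def classic_idf(doc_freq, num_docs):
    """IDF in the style of Lucene's classic TF-IDF similarity.

    doc_freq: number of documents containing the term
    num_docs: total number of documents in the index
    """
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical index of one million documents.
rare_term_idf = classic_idf(doc_freq=10, num_docs=1_000_000)
common_term_idf = classic_idf(doc_freq=500_000, num_docs=1_000_000)

# The rare term gets the larger IDF, so it contributes more to the score.
print(rare_term_idf, common_term_idf)
```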
Solr does not yet support distributed IDF, so each shard computes IDF
from its own local statistics. Therefore the document frequency in a 3m
shard will be higher (as a proportion of your index) compared to your
30m shard, thus its documents will score lower.
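
To make the effect concrete, here is a small Python sketch comparing
per-shard IDF with a single IDF computed over all shards together. The
shard names and document frequencies are invented for illustration (I
have assumed the term is proportionally more common in the small shard),
and the formula is the classic Lucene one, 1 + ln(numDocs / (docFreq + 1)).

```python
import math


def classic_idf(doc_freq, num_docs):
    """IDF in the style of Lucene's classic TF-IDF similarity."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical shards: (documents in shard, documents containing the term).
shards = {
    "shard1": (30_000_000, 3_000),
    "shard2": (30_000_000, 3_000),
    "shard3": (3_000_000, 3_000),  # same count, but a 10x larger proportion
}

# Without distributed IDF, each shard only sees its own statistics.
local_idf = {
    name: classic_idf(doc_freq, num_docs)
    for name, (num_docs, doc_freq) in shards.items()
}

# With distributed IDF, statistics would be summed across shards first.
total_docs = sum(num_docs for num_docs, _ in shards.values())
total_doc_freq = sum(doc_freq for _, doc_freq in shards.values())
global_idf = classic_idf(total_doc_freq, total_docs)

for name, idf in local_idf.items():
    print(f"{name}: local idf = {idf:.3f}")
print(f"global idf = {global_idf:.3f}")
```

In this made-up case the small shard ends up with the lowest local IDF,
so an otherwise identical document scores lower there than on the large
shards. A shared global IDF would give every shard the same value.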
I am not aware of a multiplier you can use to fix this. There is a
distributed IDF ticket in JIRA; it may be mature enough to help you.
Upayavira
On Thu, Jun 20, 2013, at 01:56 AM, Learner wrote:
Hi,
Sorry if it's a very basic question, but I am pretty new to SolrCloud and
I am trying to understand the underlying mechanism for calculating
relevancy.
Currently we are using SOLR 3.6.X and we use shards to perform
distributed searching. Our shards are not of equal size, hence sometimes
the results are not as we expected.
For example: Shard 1 has 30 million documents, Shard 2 has 30 million
documents and shard 3 has just 3 million documents (push indexing via
message queue).
When we do a search using shards, documents from shard 1 and shard 2 get
higher priority compared to documents in shard 3 (since it's smaller).
Currently we add an index-time boost when adding documents to shard 3 so
that the documents from shard 3 also come up (higher) in search results.
Now when using SolrCloud, say for example one shard has a person name
repeated 5 times (with different unique ids) and we have one more
document with the same person name in shard 2 (with a different id).
When we do a search, how does SOLR calculate the score? Does it do
something like constant scoring across the various shards in order to
bring up the search results from all of them? How does the score get
calculated? Do all 6 documents get the same score (5 from shard 1 and 1
from shard 2, if all the fields have the same value except for the
unique id)?
Thanks,
BB
--
View this message in context:
http://lucene.472066.n3.nabble.com/SolrCloud-Score-calculation-tp4071805.html
Sent from the Solr - User mailing list archive at Nabble.com.