Relevancy Scoring

John Blythe Mon, 18 May 2015 10:59:12 -0700

Background:
I'm using Solr as a search mechanism for users, but before even getting
to that point, as a means of intelligent inference, more or less. Product
data comes in and we're hoping to match it to the correct known product
without having to ask the user for confirmation/search.


Problem:
I get a maxScore (with the correct result at the top) of 618.22626 using
the manufacturer's name, the product number, and the product description.
All of these items are coming from a previous purchaser, so we have to
account for manufacturer name variations, miskeying of product numbers, and
variations in descriptions. The maxScore rises to 772 when I remove the
description.

My initial question is regarding relevancy scoring (
https://wiki.apache.org/solr/SolrRelevancyFAQ). I get that many of the
description's tokens will be found throughout the other documents, which
keeps that field's relevancy low per the IDF portion of the score. I
suppose the actual question, then, is whether a low relevancy score on one
field hurts the rest of them / the cumulative score, or whether it simply
keeps that field's contribution lower than it'd otherwise be. I thought it
was the latter, but the results I mention above are making me think that
the former is actually the case.
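
To make the two scenarios concrete, here is a toy sketch. It is not Solr's
actual scoring formula; the function names, the field scores, and the
1/sqrt(n) scaling factor are all hypothetical, chosen only to show how a
query-wide factor could make the total drop when a weak field is added.

```python
import math

# Hypothetical per-field scores for one document (toy numbers, not real Solr output).
WITHOUT_DESCRIPTION = {"manufacturer": 5.0, "product_number": 4.0}
WITH_DESCRIPTION = {"manufacturer": 5.0, "product_number": 4.0, "description": 0.5}


def additive_score(field_scores):
    """Scenario 'latter': each field only adds its own contribution."""
    return sum(field_scores.values())


def scaled_score(field_scores):
    """Scenario 'former': the sum is multiplied by a query-wide factor.

    The factor here (1 / sqrt(number of fields queried)) is an assumption
    made purely for illustration.
    """
    factor = 1.0 / math.sqrt(len(field_scores))
    return factor * sum(field_scores.values())


additive_without = additive_score(WITHOUT_DESCRIPTION)
additive_with = additive_score(WITH_DESCRIPTION)
scaled_without = scaled_score(WITHOUT_DESCRIPTION)
scaled_with = scaled_score(WITH_DESCRIPTION)

print("additive:", additive_without, "->", additive_with)
print("scaled:  ", scaled_without, "->", scaled_with)
```

In the additive model, a weak description match can only raise the total. In
the scaled model, adding it lowers the total, which is the direction I am
seeing.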

Depending on what I hear about the above, a follow-up question may be what
in the world is wrong with my analyzer :)
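
For reference, the per-field breakdown of a score can be inspected by adding
Solr's debugQuery=true parameter to the request. A minimal sketch of
building such a request URL follows; the host, the core name ("products"),
and the field names in the query are placeholders, not my actual setup.

```python
from urllib.parse import urlencode


def build_debug_url(base_url, query):
    """Build a Solr select URL that asks for scores and a score explanation."""
    params = {
        "q": query,
        "fl": "*,score",       # return all stored fields plus the score
        "debugQuery": "true",  # include the scoring explanation in the response
        "wt": "json",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host, core, and field names, for illustration only.
url = build_debug_url(
    "http://localhost:8983/solr/products",
    "manufacturer:acme AND product_number:12345",
)
print(url)
```

Comparing the explanation with and without the description clause should
show whether the other fields' contributions change or only the total does.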

Thanks for any thoughts!

Best,
John
