> The maxScore is 772 when I remove the description.
> I suppose the actual question, then, is if a low relevancy score on one
> field hurts the rest of them / the cumulative score,

This depends a lot on how you're searching over these fields. Is this a
(e)dismax query? Or a lucene query? Something else?

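Across fields there's query normalization (queryNorm), which is computed as
1 / sqrt of the sum of squares of the IDFs (times any boosts) of the search
terms across the fields being searched. Adding or removing a field changes
that sum, so it changes queryNorm and with it the absolute scores you see,
including maxScore.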
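
To make the normalization point concrete, here is a minimal sketch, assuming the classic TF-IDF similarity and leaving out the query-level boost. The IDF values are made up for illustration and are not taken from your index.

```python
import math


def query_norm(idfs, boosts=None):
    """Classic TF-IDF queryNorm: 1 / sqrt(sum of squared term weights).

    Each term weight is idf * boost. The IDF values passed in are
    hypothetical, one per (field, term) clause in the query.
    """
    if boosts is None:
        boosts = [1.0] * len(idfs)
    sum_of_squared_weights = sum((idf * b) ** 2 for idf, b in zip(idfs, boosts))
    return 1.0 / math.sqrt(sum_of_squared_weights)


# Hypothetical query: two clauses on name/number plus two description terms.
with_description = query_norm([3.0, 4.0, 2.0, 2.0])
# The same query with the description clauses removed.
without_description = query_norm([3.0, 4.0])

print(with_description, without_description)
```

Fewer clauses means a smaller sum of squares and a larger queryNorm, so every clause that remains is multiplied by a bigger number. The factor is the same for every document in a given query, so it shifts absolute scores rather than the ranking.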

By removing a field, you also likely remove a boolean clause. With one
fewer clause, there's less of a chance the coordination factor (known as
coord), which scales a document's score by the fraction of query clauses
it matches, would punish your relevancy score.
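
A rough sketch of the coord effect, assuming a plain boolean query of optional clauses under the classic similarity and ignoring queryNorm. The per-clause scores are hypothetical, and a dismax query combines fields differently, so treat this only as an illustration of the idea.

```python
def coord(overlap, max_overlap):
    """Fraction of the query's clauses that the document matched."""
    return overlap / max_overlap


def boolean_score(clause_scores):
    """Sum the matching clause scores, then scale the sum by coord.

    clause_scores holds one entry per query clause; None marks a clause
    the document did not match. All values here are hypothetical.
    """
    matched = [s for s in clause_scores if s is not None]
    return sum(matched) * coord(len(matched), len(clause_scores))


# Three clauses, but the document misses the description clause.
three_clauses = boolean_score([10.0, 20.0, None])
# The same document once the description clause is dropped from the query.
two_clauses = boolean_score([10.0, 20.0])

print(three_clauses, two_clauses)
```

The unmatched clause contributes nothing itself, yet it still drags the total down through coord (2 of 3 clauses matched). Dropping the clause removes that penalty.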

Otherwise, don't know -- perhaps you could give us more information on how
you're searching your documents? Perhaps a sample Solr URL that shows how
you're querying?
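
For instance, a request along these lines with debugQuery=true would return the per-field score explanation. This is only a sketch: the host, the core name (products), the field names in qf and the query text are all guesses and would need to match your setup.

```python
from urllib.parse import urlencode

# All values below are hypothetical placeholders, not the poster's setup.
params = {
    "q": "acme 12345 widget",
    "defType": "edismax",
    "qf": "manufacturer product_number description",
    "fl": "id,score",
    "debugQuery": "true",  # adds the score explanation to the response
}

url = "http://localhost:8983/solr/products/select?" + urlencode(params)
print(url)
```

Running it once with and once without the description field in qf, then comparing the two explain outputs, should show where the difference in maxScore comes from.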

Cheers,
-- 
Doug Turnbull | Search Relevance Consultant | OpenSource Connections,
LLC | 240.476.9983 | http://www.opensourceconnections.com
Author: Relevant Search <http://manning.com/turnbull> from Manning
Publications
On Mon, May 18, 2015 at 1:57 PM, John Blythe <j...@curvolabs.com> wrote:

> Background:
> I'm using Solr as a mechanism for search for users, but before even getting
> to that point as a means of intelligent inference more or less. Product
> data comes in and we're hoping to match it to the correct known product
> without having to use the user for confirmation/search.
>
> Problem:
> I get a maxScore (with the correct result at the top) of 618.22626 using
> the manufacturer's name, the product number, and the product description.
> All of these items are coming from a previous purchaser so we have to
> account for manufacturer name variations, miskeying of product numbers, and
> variances of descriptions. The maxScore is 772 when I remove the
> description.
>
> My initial question is regarding relevancy scoring (
> https://wiki.apache.org/solr/SolrRelevancyFAQ). I get that many of the
> description's tokens will be found throughout the other documents, thus
> keeping the relevancy at bay per the IDF portion of the relevancy score. I
> suppose the actual question, then, is if a low relevancy score on one field
> hurts the rest of them / the cumulative score, or if it simply keeps that
> field's contribution lower than it'd otherwise be. I thought it was the
> latter, but the results I mention above are making me think that the first
> scenario is actually the case.
>
> Based on what I hear about the above, a follow up question may be what in
> the world is wrong with my analyzer :)
>
> Thanks for any thoughts!
>
> Best,
> John
>
