You might just need some syntax help. I'm not sure what the Solr admin UI
escapes, but much of the text in your query has reserved meaning. Also,
when a term appears without a fieldName: prefix directly in front of it, I
believe it's going to search the default field (it's no longer attached to
the field you named). You need to use parens to attach multiple terms to
that field for search.

I'd try the following to see if any of it helps:

Add parens to group terms to the field:

mfgname2:(Ben & Jerry's) +descript1:(Strawberry Shortcake Ice Cream 1.5pt)
+productnumber:(001-029-1298)

Also keep in mind "+" means mandatory, and it's an operator on just the one
clause that follows it. So in the above you're requiring that the
description and product number clauses match the provided terms.
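
To make the grouping concrete, here's a minimal Python sketch that assembles
the query above from its parts. The helper name field_clause is made up for
illustration; it isn't part of Solr or any client library, and the field
names are just the ones from your message.

```python
def field_clause(field, terms, required=False):
    """Attach every term to one field by wrapping the terms in parens."""
    prefix = "+" if required else ""  # "+" marks the clause as mandatory
    return f"{prefix}{field}:({terms})"


query = " ".join([
    field_clause("mfgname2", "Ben & Jerry's"),
    field_clause("descript1", "Strawberry Shortcake Ice Cream 1.5pt", required=True),
    field_clause("productnumber", "001-029-1298", required=True),
])
print(query)
```

Note this only handles the grouping. It doesn't escape anything yet.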

Further, you may need to escape the "-", as that can mean "NOT". You can do
that with a backslash:

mfgname2:(Ben & Jerry's) +descript1:(Strawberry Shortcake Ice Cream 1.5pt)
+productnumber:(001\-029\-1298)
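
If you end up building these queries in code rather than typing them into
the admin UI, a rough Python sketch of the escaping might look like the
following. The escape_term helper is hypothetical, and the character list is
my reading of the reserved characters in the Lucene query parser docs, so
check it against the version you're running.

```python
import re
from urllib.parse import urlencode

# Characters the classic Lucene/Solr query parser treats as syntax
# (assumed list; verify against your Solr version).
_RESERVED = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_term(text):
    """Put a backslash in front of every reserved character."""
    return _RESERVED.sub(r"\\\1", text)


product_number = escape_term("001-029-1298")
clause = f"+productnumber:({product_number})"
print(clause)

# If the query is placed into a URL by hand, percent-encode it as well so
# characters such as "&" and "+" are not misread as URL syntax.
print(urlencode({"q": clause}))
```

The backslash escaping is for Solr's query parser; the percent-encoding is a
separate layer that only matters when the query travels in a URL.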

You can read more in the article on Solr query syntax:
https://wiki.apache.org/solr/SolrQuerySyntax

Hope that helps. For all I know, your cut and paste just didn't come through
cleanly and I'm wrongly assuming you have syntax issues :)

-Doug

On Mon, May 18, 2015 at 2:25 PM, John Blythe <j...@curvolabs.com> wrote:

> Hey Doug,
>
> Thanks for the quick reply.
>
> No edismax just yet. Planning on getting there, but have been trying to
> fine tune the 3 primary fields we use over the last week or so before
> jumping into edismax and its nifty toolset to help push our accuracy and
> precision even further (aside: is this a good strategy?)
>
> For now I'm querying directly in the admin interface, doing something like
> this:
> mfgname2: Ben & Jerry's + descript1: Strawberry Shortcake Ice Cream 1.5pt +
> productnumber: 001-029-1298
>
> versus
> mfgname2: Ben & Jerry's + descript1: Strawberry Shortcake Ice Cream 1.5pt
>
> Another interesting and likely related factor is the description's lack of
> help. With the product number in place it gets nailed even with stray
> zeros, 4's instead of 1's, etc.
>
> Without it, though, the querying just flat out sucks. For instance, I just
> saw something akin to this:
> mfgname2: Ben & Jerry's + descript1: Straw Shortcake Ice Cream 1.5pt
>
> that got nowhere near what it should have. Straw would have a synonym to
> map to strawberry and would match the document's description *exactly*, yet
> Solr would push out all sorts of peripheral suggestions that didn't match
> strawberry or were a different amount (.75pt, for instance). I know I'm no
> expert, but I was thinking my analyzer was a bit better than that :p
>
> --
> *John Blythe*
> Product Manager & Lead Developer
>
> 251.605.3071 | j...@curvolabs.com
> www.curvolabs.com
>
> 58 Adams Ave
> Evansville, IN 47713
>
> On Mon, May 18, 2015 at 2:18 PM, Doug Turnbull <
> dturnb...@opensourceconnections.com> wrote:
>
> > > The maxScore is 772 when I remove the description.
> > > I suppose the actual question, then, is if a low relevancy score on
> > > one field hurts the rest of them / the cumulative score,
> >
> > This depends a lot on how you're searching over these fields. Is this a
> > (e)dismax query? Or a lucene query? Something else?
> >
> > Across fields there's query normalization, which attempts to take a sum
> > of squares of IDFs of the search terms across the fields being searched.
> > Adding/removing a field could impact query normalization.
> >
> > By removing a field, you also likely remove a boolean clause. By removing
> > the clause, there's less of a chance the coordinating factor (known as
> > coord) would punish your relevancy score.
> >
> > Otherwise, don't know -- perhaps you could give us more information on
> > how you're searching your documents? Perhaps a sample Solr URL that shows
> > how you're querying?
> >
> > Cheers,
> > --
> > *Doug Turnbull* | Search Relevance Consultant | OpenSource Connections,
> > LLC | 240.476.9983 | http://www.opensourceconnections.com
> > Author: Relevant Search <http://manning.com/turnbull> from Manning
> > Publications
> > This e-mail and all contents, including attachments, is considered to be
> > Company Confidential unless explicitly stated otherwise, regardless
> > of whether attachments are marked as such.
> > On Mon, May 18, 2015 at 1:57 PM, John Blythe <j...@curvolabs.com> wrote:
> >
> > > Background:
> > > I'm using Solr as a mechanism for search for users, but before even
> > > getting to that point as a means of intelligent inference more or less.
> > > Product data comes in and we're hoping to match it to the correct known
> > > product without having to use the user for confirmation/search.
> > >
> > > Problem:
> > > I get a maxScore (with the correct result at the top) of 618.22626
> > > using the manufacturer's name, the product number, and the product
> > > description. All of these items are coming from a previous purchaser so
> > > we have to account for manufacturer name variations, miskeying of
> > > product numbers, and variances of descriptions. The maxScore is 772
> > > when I remove the description.
> > >
> > > My initial question is regarding relevancy scoring (
> > > https://wiki.apache.org/solr/SolrRelevancyFAQ). I get that many of the
> > > description's tokens will be found throughout the other documents, thus
> > > keeping the relevancy at bay per the IDF portion of the relevancy
> > > score. I suppose the actual question, then, is if a low relevancy score
> > > on one field hurts the rest of them / the cumulative score, or if it
> > > simply keeps that field's contribution lower than it'd otherwise be. I
> > > thought it was the latter, but the results I mention above are making
> > > me think that the first scenario is actually the case.
> > >
> > > Based on what I hear about the above, a follow up question may be what
> > > in the world is wrong with my analyzer :)
> > >
> > > Thanks for any thoughts!
> > >
> > > Best,
> > > John
> > >
> >
>
