Re: Boosting results

Peter Keegan Fri, 07 Nov 2008 10:00:27 -0800

If you sort first by score, keep in mind that the raw scores are very
precise and you could see many unique values in the result set. The
secondary sort field would only be used to break equal scores. We had to use
a custom comparator to 'smooth out' the scores to allow the second field to
take effect.


Peter


On Fri, Nov 7, 2008 at 11:17 AM, Scott Smith <[EMAIL PROTECTED]>wrote:

> Well, it's not like sorting hadn't occurred to me.  Unfortunately, what
> I recalled was that you could only sort results on one field (I do date
> sorted searches all the time in my application).  I should have gone
> back and looked.  My memory failed me as I can see that you can sort on
> multiple fields and "score" (aka relevancy) is one of the pseudo fields.
> That'll work.
>
> Thanks.
>
> Scott
>
> -----Original Message-----
> From: Erick Erickson [mailto:[EMAIL PROTECTED]
> Sent: Friday, November 07, 2008 5:59 AM
> To: java-user@lucene.apache.org
> Subject: Re: Boosting results
>
> duuuuh, sorting. I absolutely love it when I overlook the obvious <G>.
>
> [EMAIL PROTECTED]
>
> On Fri, Nov 7, 2008 at 4:58 AM, Michael McCandless <
> [EMAIL PROTECTED]> wrote:
>
> >
> > Couldn't you just do a single Query that sorts first by category and
> second
> > by relevance?
> >
> > Mike
> >
> >
> > Erick Erickson wrote:
> >
> >  It seems to me that the easiest thing would be to fire two queries
> and
> >> then just concatenate the results
> >>
> >> category:A AND body:fred
> >>
> >> category:B AND body:fred
> >>
> >>
> >> If you really, really didn't want to fire two queries, you could
> create
> >> filters on category A and category B and make a couple of
> >> passes through your results seeing if the returned documents were in
> >> the filter, but you'd still concatenate the results. Actually in your
> >> specific example you could make one filter on A.....
> >>
> >> You could also consider a custom scorer that, added 1,000,000 to
> every
> >> category A document.
> >>
> >> How much were you boosting by? What happens if you boost by a very
> large
> >> factor?
> >> As in ridiculously large?
> >>
> >> Best
> >> Erick
> >>
> >> On Thu, Nov 6, 2008 at 7:42 PM, Scott Smith
> <[EMAIL PROTECTED]
> >> >wrote:
> >>
> >>  I'm interested in comments on the following problem.
> >>>
> >>>
> >>>
> >>> I have a set of documents.  They fall into 3 categories.  Call these
> >>> categories A, B, and C.  Each document has an indexed, non-tokenized
> >>> field called "category" which contains A, B, or C (they are mutually
> >>> exclusive categories).
> >>>
> >>>
> >>>
> >>> All of the documents contain a field called "body" which contains a
> >>> bunch of text.  This field is indexed and tokenized.
> >>>
> >>>
> >>>
> >>> So, I want to do a search which looks something like:
> >>>
> >>>
> >>>
> >>> (category:A OR category:B) AND body:fred
> >>>
> >>>
> >>>
> >>> I want all of the category A documents to come before the category B
> >>> documents.  Effectively, I want to have the category A documents
> first
> >>> (sorted by relevancy) and then the category B documents after
> (sorted by
> >>> relevancy).
> >>>
> >>>
> >>>
> >>> I thought I could do this by boosting the category portion of the
> query,
> >>> but that doesn't seem to work consistently.  I was setting the boost
> on
> >>> the category A term to 1.0 and the boost on the category B term to
> 0.0.
> >>>
> >>>
> >>>
> >>> Any thoughts how to skin this?
> >>>
> >>>
> >>>
> >>> Scott
> >>>
> >>>
> >>>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

Re: Boosting results

Reply via email to