Well, it's not like sorting hadn't occurred to me.  Unfortunately, what
I recalled was that you could only sort results on one field (I do date
sorted searches all the time in my application).  I should have gone
back and looked.  My memory failed me as I can see that you can sort on
multiple fields and "score" (aka relevancy) is one of the pseudo fields.
That'll work.
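
For the archive, here is a rough sketch of the ordering I'm after: category first, then score descending within each category. It is plain Java and deliberately does not use the Lucene API. The Hit class, its fields, and the sortByCategoryThenScore helper are made-up names for illustration only; in Lucene the same effect would come from a multi-field sort with "category" as the first sort field and score as the second.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CategoryThenScoreSort {

    /** Hypothetical stand-in for a search hit; this is not a Lucene class. */
    static final class Hit {
        final String category;
        final String docId;
        final float score;

        Hit(String category, String docId, float score) {
            this.category = category;
            this.docId = docId;
            this.score = score;
        }
    }

    /** Returns a copy ordered by category (ascending), then score (descending). */
    static List<Hit> sortByCategoryThenScore(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator
                .comparing((Hit h) -> h.category)
                .thenComparing(Comparator.comparingDouble((Hit h) -> h.score).reversed()));
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up hits for a query like (category:A OR category:B) AND body:fred
        List<Hit> hits = Arrays.asList(
                new Hit("B", "b1", 0.9f),
                new Hit("A", "a1", 0.2f),
                new Hit("B", "b2", 0.4f),
                new Hit("A", "a2", 0.7f));

        // All category A hits come first, each group ordered by relevancy.
        for (Hit h : sortByCategoryThenScore(hits)) {
            System.out.println(h.category + " " + h.docId + " " + h.score);
        }
    }
}
```

Because category is the primary key, a low-scoring A document still lands ahead of a high-scoring B document, which is what boosting alone could not guarantee.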

Thanks.

Scott

-----Original Message-----
From: Erick Erickson [mailto:[EMAIL PROTECTED] 
Sent: Friday, November 07, 2008 5:59 AM
To: java-user@lucene.apache.org
Subject: Re: Boosting results

duuuuh, sorting. I absolutely love it when I overlook the obvious <G>.

[EMAIL PROTECTED]

On Fri, Nov 7, 2008 at 4:58 AM, Michael McCandless <[EMAIL PROTECTED]> wrote:

>
> Couldn't you just do a single Query that sorts first by category and
> second by relevance?
>
> Mike
>
>
> Erick Erickson wrote:
>
>> It seems to me that the easiest thing would be to fire two queries
>> and then just concatenate the results
>>
>> category:A AND body:fred
>>
>> category:B AND body:fred
>>
>>
>> If you really, really didn't want to fire two queries, you could
>> create filters on category A and category B and make a couple of
>> passes through your results seeing if the returned documents were in
>> the filter, but you'd still concatenate the results. Actually in your
>> specific example you could make one filter on A.....
>>
>> You could also consider a custom scorer that added 1,000,000 to
>> every category A document.
>>
>> How much were you boosting by? What happens if you boost by a very
>> large factor? As in ridiculously large?
>>
>> Best
>> Erick
>>
>> On Thu, Nov 6, 2008 at 7:42 PM, Scott Smith <[EMAIL PROTECTED]>
>> wrote:
>>
>>> I'm interested in comments on the following problem.
>>>
>>>
>>>
>>> I have a set of documents.  They fall into 3 categories.  Call these
>>> categories A, B, and C.  Each document has an indexed, non-tokenized
>>> field called "category" which contains A, B, or C (they are mutually
>>> exclusive categories).
>>>
>>>
>>>
>>> All of the documents contain a field called "body" which contains a
>>> bunch of text.  This field is indexed and tokenized.
>>>
>>>
>>>
>>> So, I want to do a search which looks something like:
>>>
>>>
>>>
>>> (category:A OR category:B) AND body:fred
>>>
>>>
>>>
>>> I want all of the category A documents to come before the category B
>>> documents.  Effectively, I want to have the category A documents
>>> first (sorted by relevancy) and then the category B documents after
>>> (sorted by relevancy).
>>>
>>>
>>>
>>> I thought I could do this by boosting the category portion of the
>>> query, but that doesn't seem to work consistently.  I was setting the
>>> boost on the category A term to 1.0 and the boost on the category B
>>> term to 0.0.
>>>
>>>
>>>
>>> Any thoughts how to skin this?
>>>
>>>
>>>
>>> Scott
>>>
>>>
>>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
