Thanks for the reply, Erik.

Sorry for being vague.  To be clear, we have 1-2 million records and
roughly 12000-14000 groups.
Each record is in one and only one group.

I see it working something like this

1.  Identify all records that would match search terms.  (Suppose I
search for 'dog', and get 450,000 matches)
2.  Of those records, find the distinct list of groups over all the
matches.  (Suppose there are 300.)
3.  Now get the top-ranked record from each group, as if you had
searched just for docs in that group.
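
To make the three steps concrete, here is a rough Python sketch of what
I mean.  It is only an illustration over an in-memory list of
(doc_id, group, score) tuples, not Solr code.  The function names and
the default "category" field are made up for the example, and the
second helper just builds the rows=1 plus fq query string for the
one-request-per-group approach.

```python
from urllib.parse import urlencode


def top_record_per_group(matches):
    """Return the best record per group from one search's matches.

    matches: iterable of (doc_id, group, score) tuples (hypothetical shape).
    """
    best = {}
    for doc_id, group, score in matches:
        # Steps 2 and 3 together: each distinct group becomes a key, and we
        # keep only the highest-scoring record seen so far for that group.
        if group not in best or score > best[group][1]:
            best[group] = (doc_id, score)
    # Order the groups by the score of their best record, highest first.
    rows = [(group, doc_id, score) for group, (doc_id, score) in best.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


def per_group_query(terms, group, field="category"):
    """Build the query string for a single group: one row, filtered by group."""
    return urlencode({"q": terms, "fq": f"{field}:{group}", "rows": 1})


if __name__ == "__main__":
    sample = [("d1", "A", 0.9), ("d2", "A", 1.4), ("d3", "B", 0.7)]
    print(top_record_per_group(sample))
    print(per_group_query("dog", "A"))
```

With 300 matching groups the per-group approach means 300 requests,
which is why I would like to do the grouping in one pass if possible.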

Your response has me thinking this is a hard nut to crack.  I'm
wondering if there is a way to structure ranking to get us close on
this one?

thanks
gene




On Wed, Sep 17, 2008 at 8:39 AM, Erik Hatcher
<[EMAIL PROTECTED]> wrote:
> Personally, I'd send three requests for solr, one for each group.
>  &rows=1&fq=category:A ... and so on.
>
> But that'd depend on how many groups you have.
>
> One can always hack custom request handlers to do this sort of thing all as
> a single request, but I'd guess it ain't that much slower to just make 3
> requests.  And there are fancier solutions out there that might fit as well,
> like the field collapsing patch.
>
>        Erik
>
> On Sep 16, 2008, at 4:13 PM, ristretto.rb wrote:
>
>> Hello All,
>>
>> I'm looking for a way to filter results by some ranking mechanism.
>> For example...
>>
>> Suppose you have 30 docs in an index, and they are in groups of 10, like
>> this
>>
>> A, 1
>> A, 2
>> :
>> A, 10
>>
>> B, 1
>> B, 2
>> :
>> B, 10
>>
>> C, 1
>> C, 2
>> :
>> C, 10
>>
>> I would like to get 3 records back such that I get a single,  "best",
>> result from each logical group.
>> So, if I searched with a term that would match all the docs in the
>> index, I could be certain to get
>> a doc with A in it, one with B in it and one with C in it.
>>
>> At the moment, I have a Solr index that has a category field, and the
>> index will have between 1 and 2 million results
>> when we are done indexing.
>>
>> I'm going to spend some time today researching this.  If anyone can
>> send me some advice, I would be grateful.
>>
>> I've considered post processing the results, but I'm not sure if this
>> is the wisest plan.  And, I don't know how I would get accurate
>> result counts, to do pagination.
>>
>> cheers
>
>
