Re: [Ferret-talk] Grouping with ferret

Jens Kraemer Tue, 29 Jul 2008 02:41:37 -0700

Hi!

On 28.07.2008, at 13:29, Henrik wrote:

Hi list,

I have a problem grouping with ferret.

I'm using the filter_proc from Dave's book as seen below

results = {}
group_by_proc = lambda do |doc_id, score, searcher|
  doc = searcher[doc_id]
  (results[doc[:pk_file_id]] ||= []) << doc[:filename] << doc[:path]
  next true
end


The problem is that if I use this it ignores my limit clause.
I set the limit to 10 and I still get 5995 results, and it takes several seconds.
How come the limit clause is ignored when using a filter_proc? How can I change this behaviour?

Filters are applied by Ferret before the result is limited; that's why your filter gets to see all possible results regardless of the limit you specify. If it were implemented the other way around, first limiting and then filtering, you could end up with fewer than limit results whenever your filter actually filtered anything out. Of course, in your case this wouldn't happen, as your filter does no filtering but always returns true.
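
To make the ordering concrete, here is a small plain-Ruby model of the two alternatives. It does not use Ferret at all: the hits array and the keep_even filter are made-up stand-ins, so treat it as a sketch of the reasoning rather than of Ferret's internals.

```ruby
# Plain-Ruby model of the two possible orderings; no Ferret required.
# `hits` stands in for the ranked matches of a query (hypothetical data).
hits = (1..20).to_a

# A filter that actually rejects something: keep even doc ids only.
keep_even = lambda { |doc_id| doc_id.even? }

limit = 5

# Filter first, then limit (the order described above): the filter is
# called for every hit, and the result is full whenever enough hits pass.
filter_calls = 0
filter_then_limit = hits.select do |id|
  filter_calls += 1
  keep_even.call(id)
end.first(limit)

# Limit first, then filter (the hypothetical alternative): the filter
# runs on only `limit` hits, but the result can come up short.
limit_then_filter = hits.first(limit).select { |id| keep_even.call(id) }

puts "filter calls:      #{filter_calls}"
puts "filter then limit: #{filter_then_limit.inspect}"
puts "limit then filter: #{limit_then_filter.inspect}"
```

In this model the first ordering calls the filter on all 20 hits and returns 5 of them, while the second calls it on only 5 hits and returns 2.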

If you really only want the first 10 results, why don't you just use the results you get back and do your result collecting there, like this?

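results = {}

# No filter_proc here: the block only sees the (at most) 10 limited hits.
hit_count = index.search_each(query, :limit => 10) do |doc_id, score|
  doc = index[doc_id]  # search_each yields the document id, so load the document
  (results[doc[:pk_file_id]] ||= []) << doc[:filename] << doc[:path]
end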

You could of course also return false in your filter_proc for every possible hit once your results collection has reached the desired size, to save the time spent collecting all results.
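
A minimal sketch of that early-exit idea, assuming you cap on the number of groups. The names max_groups, capped_group_by_proc and fake_searcher are made up for illustration, and the Hash only stands in for a real Ferret searcher so the snippet runs on its own.

```ruby
max_groups = 2   # hypothetical cap on the number of groups to collect
results = {}
docs_loaded = 0  # counts how often a document actually had to be loaded

capped_group_by_proc = lambda do |doc_id, score, searcher|
  # Once enough groups are collected, reject the hit without loading it.
  next false if results.size >= max_groups

  doc = searcher[doc_id]
  docs_loaded += 1
  (results[doc[:pk_file_id]] ||= []) << doc[:filename] << doc[:path]
  true
end

# Stand-in for a Ferret searcher: a Hash from doc id to stored fields.
fake_searcher = {
  0 => { :pk_file_id => "a", :filename => "a.txt",  :path => "/x" },
  1 => { :pk_file_id => "b", :filename => "b.txt",  :path => "/y" },
  2 => { :pk_file_id => "c", :filename => "c.txt",  :path => "/z" },
  3 => { :pk_file_id => "a", :filename => "a2.txt", :path => "/x" },
}

# Call the proc once per hit, as a search would.
accepted = fake_searcher.keys.select do |id|
  capped_group_by_proc.call(id, 1.0, fake_searcher)
end

puts "accepted hits: #{accepted.inspect}"
puts "groups:        #{results.keys.inspect}"
puts "docs loaded:   #{docs_loaded}"
```

Note that the proc is still called for every hit; what you save is loading the document and building the result. Once the cap is reached, later hits that belong to an already collected group are dropped as well, so a group may be incomplete.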


cheers,
Jens


--
Jens Krämer
Finkenlust 14, 06449 Aschersleben, Germany
VAT Id DE251962952
http://www.jkraemer.net/ - Blog
http://www.omdb.org/     - The new free film database

