Re: Finding distinct unique IDs in documents returned by fq -- Urgent Help Req

Jonathan Rochkind Thu, 22 Jul 2010 14:15:38 -0700

Chris Hostetter wrote:

computing the number: in some algorithms it's relatively cheap (on a single server) but in others it's more expensive than computing the facet counts being returned (consider the case where we are sorting in term order - once we have collected counts for ${facet.limit} constraints, we can stop iterating over terms -- but to compute the total number of constraints (ie: terms) we would have to keep going and test every one of them against ${facet.mincount})
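The term-order case described above can be illustrated with a small sketch. This is not Solr's code; the function names and the in-memory list of (term, count) pairs are assumptions made up for the example. It only shows why stopping at the limit examines fewer terms than counting every constraint does.

```python
# Hypothetical sketch, not Solr's implementation: terms arrive already
# sorted by term, each paired with its count in the current result set.

def facets_in_term_order(term_counts, limit, mincount=1):
    """Return the first `limit` constraints in term order, plus the number
    of terms that had to be examined to collect them."""
    examined = 0
    collected = []
    for term, count in term_counts:
        examined += 1
        if count >= mincount:
            collected.append((term, count))
            if len(collected) == limit:
                break  # enough constraints: stop iterating over terms
    return collected, examined


def total_constraints(term_counts, mincount=1):
    """Count every constraint that passes mincount; this has to test
    every term, so no early exit is possible."""
    examined = 0
    total = 0
    for _term, count in term_counts:
        examined += 1
        if count >= mincount:
            total += 1
    return total, examined


terms = [("a", 3), ("b", 0), ("c", 5), ("d", 1), ("e", 2)]
page, examined_for_page = facets_in_term_order(terms, limit=2)
total, examined_for_total = total_constraints(terms)
```

With this sample data the first two constraints are found after examining three terms, while the total requires examining all five.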

I've been told this before, but it still doesn't really make sense to me. How can you possibly find the top N constraints without having at least examined all the constraints? How do you know which are the top N if there are some you haven't looked at? And if you've looked at them all, it's no problem to increment a counter as you look at each one. Although I guess the facet.mincount test does possibly put a crimp in things, I never set that param to anything other than 1 myself, so I hadn't considered it.
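For the count-sorted case, the argument can be sketched as follows. Again this is only an illustration under assumed names and data, not Solr's implementation: a top-N selection by count has to visit every term anyway, so a counter of qualifying constraints can be kept in the same pass.

```python
import heapq

# Hypothetical sketch, not Solr's implementation: select the top N
# constraints by count and count all qualifying constraints in one pass.

def top_n_by_count(term_counts, n, mincount=1):
    total = 0   # number of constraints passing mincount
    heap = []   # min-heap of (count, term), holds at most n entries
    for term, count in term_counts:
        if count < mincount:
            continue
        total += 1  # the counter costs one increment per visited term
        if len(heap) < n:
            heapq.heappush(heap, (count, term))
        elif (count, term) > heap[0]:
            heapq.heapreplace(heap, (count, term))
    # Highest count first; ties fall back to term order only so that the
    # result is deterministic.
    top = [(term, count) for count, term in sorted(heap, reverse=True)]
    return top, total


terms = [("a", 3), ("b", 0), ("c", 5), ("d", 1), ("e", 2)]
top, total = top_n_by_count(terms, n=2)
```

Here the extra bookkeeping is a single integer, which is the point being made; whether the same holds for every faceting method in Solr is exactly what remains in question.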

But I may be missing something. I've examined only one of the code paths/methods for faceting in the source code, the one (if my reading was correct) that ends up used for high-cardinality multi-valued fields -- in that method, it looked like it should add no work at all to give you a facet unique value (result set value cardinality) count (with facet.mincount of 1, anyway). But I may have been misreading, or it may be that other methods are more troublesome.

At any rate, if I need it badly enough, I'll try to write my own facet component that does it (perhaps a subclass of the existing SimpleFacet), and see what happens. It does seem to be something a variety of people's use cases could use; I see it mentioned periodically in the listserv archives.


Jonathan
