: > being returned (consider the case where we are sorting in term order - once
: > we have collected counts for ${facet.limit} constraints, we can stop
: > iterating over terms -- but to compute the total number of constraints (ie:
: > terms) we would have to keep going and test every one of them against
: > ${facet.mincount})
: >
: I've been told this before, but it still doesn't really make sense to me. How
: can you possibly find the top N constraints, without having at least examined
: all the constraints? How do you know which are the top N if there are some you
that's exactly my point: in the scenario where you've asked for
facet.mincount=N&facet.limit=M&facet.sort=index you don't have to find the
"top" constraints, you just have to find the first M terms in index order
that have a count of at least N.

: But I may be missing something. I've examined only one of the code
: paths/methods for faceting in source code, the one (if my reading was correct)
: that ends up used for high-cardinality multi-valued fields -- in that method,
: it looked like it should add no work at all to give you a facet unique value
: (result set value cardinality) count. (with facet.mincount of 1 anyway). But
: I may have been mis-reading, or it may be that other methods are more
: troublesome.

in any case where you are sorting by *counts* then yes, all of the
constraints have to be checked, so you can count them as you go -- but
that doesn't scale in distributed faceting, you can't just add the counts
up from each shard because you don't know what the overlap is -- hence my
comment about how to dedup them.

there are some simple usecases where it's feasible, but in general it's a
very hard problem.

-Hoss
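
to make the two points above concrete, here's a rough sketch in plain Python. it is not Solr's code: the function names, the shard data, and the "list of (term, count) pairs already in index order" representation are all made up for illustration. it just shows (a) that with index-order sorting you can stop after M matching terms, while a total constraint count has to walk every term, and (b) that summing per-shard totals overcounts when shards share terms.

```python
# Hypothetical illustration only; none of these names come from Solr.
# Each "shard" is a list of (term, count) pairs already in index order.

def facet_index_order(terms_in_index_order, limit, mincount):
    """Return the first `limit` terms with count >= mincount, plus how many
    terms had to be examined before we could stop."""
    page = []
    examined = 0
    for term, count in terms_in_index_order:
        examined += 1
        if count >= mincount:
            page.append((term, count))
            if len(page) == limit:
                break  # early exit: later terms are never looked at
    return page, examined

def count_constraints(terms_in_index_order, mincount):
    """Total number of constraints: no early exit is possible here."""
    return sum(1 for _, count in terms_in_index_order if count >= mincount)

def naive_distributed_total(shards, mincount):
    """Add up per-shard totals; terms present on several shards are counted
    once per shard, so this can overcount."""
    return sum(count_constraints(shard, mincount) for shard in shards)

def exact_distributed_total(shards, mincount):
    """Dedup by merging every term from every shard before counting.
    Correct, but every shard has to ship its full term list."""
    merged = {}
    for shard in shards:
        for term, count in shard:
            merged[term] = merged.get(term, 0) + count
    return sum(1 for count in merged.values() if count >= mincount)

# Made-up data: "cherry" matches on both shards.
shard_a = [("apple", 3), ("banana", 0), ("cherry", 2), ("date", 1), ("elder", 5)]
shard_b = [("banana", 4), ("cherry", 1), ("fig", 2)]

page, examined = facet_index_order(shard_a, limit=2, mincount=1)
print(page, examined)                                   # stops after 3 of 5 terms
print(count_constraints(shard_a, mincount=1))           # has to visit all 5 terms
print(naive_distributed_total([shard_a, shard_b], 1))   # counts "cherry" twice
print(exact_distributed_total([shard_a, shard_b], 1))   # deduped total
```

the exact version only works here because each shard hands over its complete term list, which is precisely the part that gets expensive on high-cardinality fields.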