Re: minpercentage vs. mincount

2010-06-02 Thread Chris Hostetter

: Obviously I could implement this in userland (like mincount for 
: that matter), but I wonder if anyone else sees use in being able to 
: define that a facet must match a minimum percentage of all documents in 
: the result set, rather than a hardcoded value? The idea being that while 
: I might not be interested in a facet that only covers 3 documents in the 
: result set if there are, let's say, 1000 documents in the result set, the 
: situation would be a lot different if I only have 10 documents in the 
: result set.

typically people deal with this type of situation by using facet.limit to 
ensure they only get the top N constraints back -- and they set 
facet.mincount to something low just to save bandwidth by dropping 
counts that are too low to care about no matter how few results there 
are (ie: 0)
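
For illustration, a request along those lines might be built as below. This is only a sketch: the field name `category` and the helper `facet_params` are made up, and the chosen values are arbitrary. Only the `facet.*` parameter names come from Solr.

```python
from urllib.parse import urlencode

def facet_params(field, limit=10, mincount=1):
    """Build Solr-style query parameters for a capped facet request.

    `field` is a hypothetical facet field name; the defaults for
    `limit` and `mincount` are arbitrary example values.
    """
    return {
        "q": "*:*",
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,        # only the top N constraints come back
        "facet.mincount": mincount,  # constraints with a count of 0 are dropped
    }

# Query string that would be appended to the select URL.
query_string = urlencode(facet_params("category"))
```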

: I have not seen such a feature yet. Would it make sense to file it as a 
: feature request, or should stuff like this rather be done in userland (I 
: have noticed for example that Solr prefers to have users normalize the 
: scores in userland too)?

feel free to file a feature request -- truthfully this is kind of a hard 
problem to solve in userland, you'd either have to do two queries (the 
first to get the numFound, the second with facet.mincount set as an 
integer relative to numFound) or you'd have to do a single query but ask for 
a big value for facet.limit and hope that you get enough to prune your 
list.
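
A rough sketch of the arithmetic behind the two-query variant, assuming the desired threshold is given as a percentage. The names `min_count_for` and `min_percentage` are hypothetical, not anything Solr provides.

```python
import math

def min_count_for(num_found, min_percentage):
    """Translate a percentage threshold into an integer facet.mincount.

    num_found      -- numFound taken from the first query's response
    min_percentage -- hypothetical threshold, e.g. 1 for "at least 1%"
    """
    if num_found <= 0:
        return 1
    # Round up so a facet never slips in below the requested share,
    # and never go below 1 so zero-count facets are always dropped.
    return max(1, math.ceil(num_found * min_percentage / 100))

# With a 1% threshold, a facet covering 3 documents is dropped for
# 1000 results (mincount 10) but kept for 10 results (mincount 1).
large = min_count_for(1000, 1)
small = min_count_for(10, 1)
```

The value returned would then be passed as facet.mincount on the second query.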

Off the top of my head though: i can't really think of a sane way to do 
this on the server side that would work with distributed search either -- 
but go ahead and open an issue and let's see what the folks who are really 
smart about the distributed searching stuff have to say.


-Hoss



Re: minpercentage vs. mincount

2010-06-02 Thread Lukas Kahwe Smith
thx for your reply!

On 02.06.2010, at 20:27, Chris Hostetter wrote:

> feel free to file a feature request -- truthfully this is kind of a hard 
> problem to solve in userland, you'd either have to do two queries (the 
> first to get the numFound, the second with facet.mincount set as an 
> integer relative to numFound) or you'd have to do a single query but ask for 
> a big value for facet.limit and hope that you get enough to prune your 
> list.

well, i would probably implement it by just not setting a limit, and then 
pruning the facets based on numFound before sending them to the client 
(aka browser)
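
A minimal sketch of that pruning step, assuming the facet counts have already been parsed into a plain mapping of value to count. `prune_facets` and `min_percentage` are hypothetical names. As far as I know, leaving facet.limit unset falls back to Solr's default of 100, so getting every constraint back means asking for facet.limit=-1.

```python
def prune_facets(facet_counts, num_found, min_percentage):
    """Keep only facets covering at least min_percentage of the results.

    facet_counts   -- dict of facet value -> count (assumed already parsed)
    num_found      -- total number of documents in the result set
    min_percentage -- hypothetical threshold, e.g. 1 for "at least 1%"
    """
    return {
        value: count
        for value, count in facet_counts.items()
        # Multiply instead of dividing to stay in integer arithmetic.
        if count * 100 >= min_percentage * num_found
    }

# A facet with 3 documents is dropped when there are 1000 results...
pruned_large = prune_facets({"red": 600, "green": 50, "blue": 3}, 1000, 1)
# ...but kept when there are only 10.
pruned_small = prune_facets({"red": 6, "blue": 3}, 10, 1)
```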

> Off the top of my head though: i can't really think of a sane way to do 
> this on the server side that would work with distributed search either -- 
> but go ahead and open an issue and let's see what the folks who are really 
> smart about the distributed searching stuff have to say.


ok i have created it:
https://issues.apache.org/jira/browse/SOLR-1937

regards,
Lukas Kahwe Smith
m...@pooteeweet.org





minpercentage vs. mincount

2010-05-26 Thread Lukas Kahwe Smith
Hi,

Obviously I could implement this in userland (like mincount for that 
matter), but I wonder if anyone else sees use in being able to define that a 
facet must match a minimum percentage of all documents in the result set, 
rather than a hardcoded value? The idea being that while I might not be 
interested in a facet that only covers 3 documents in the result set if there 
are, let's say, 1000 documents in the result set, the situation would be a lot 
different if I only have 10 documents in the result set.

I have not seen such a feature yet. Would it make sense to file it as a feature 
request, or should stuff like this rather be done in userland (I have noticed 
for example that Solr prefers to have users normalize the scores in userland 
too)?

regards,
Lukas Kahwe Smith
m...@pooteeweet.org