: DocSet valueDocSet = req.getSearcher().getDocSet(item.getQuery());
: long count = valueDocSet.intersectionSize(results);
:
: Is this the preferred way to obtain such a count, or is there another

that's a very good way to do it. You could also use
SolrIndexSearcher.numDocs -- it is essentially the same thing, but in
the future there may be optimizations that can be done to eliminate the
construction of one DocSet (if the other one already exists)

: way, such as dealing directly with BitSets (something I avoided, since
: it appears getBits() is deprecated in the DocSet interface)?

avoid the BitSets -- Solr actually doesn't use them natively at all any
more (take a look at the OpenBitSet class) and even then small DocSets
don't have one -- a HashDocSet is used instead.

: Similarly, since this method is commented as "cache-aware", does that
: mean that the item itself does not need to worry about caching its
: results, only its terms, since the results will end up in the
: queryResultCache? Or is this assumption incorrect, and should each
: facet/item be concerned with caching its results as well?

You should be able to avoid worrying about caching the results of your
individual facets: the filterCache will take care of that.

: general. To that end, my current structure defines:
:
: - a <facetHandler/> entry in solrconfig.xml, the only current
: implementation of which loads a set of Facet definitions from an xml
: file.
: - each Facet contains an id for lookups and a List of FacetItems (some
: statically configured, some constructed dynamically from available
: Terms, though not backed by any cache yet.)
: - each FacetItem contains a displayName and Query (and associated queryString)

I'm not sure i understand what exactly the "facetHandler" registration
gains you that you couldn't have achieved in a custom requestHandler
(without needing to modify the internals/config parsing and so on) ...
your custom request handler could take in a "FacetHandler" class name as
an init param, or it could have taken in the XML information directly as
a deeply nested set of init params. am i missing something else?
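
To make the counting and caching points above concrete, here is a minimal stand-alone sketch of the idea: each facet query is turned into a set of matching doc ids, that set is cached by query, and the facet count is the size of its intersection with the main result set. None of this is Solr code. `FacetCountSketch`, `FilterCache`, and `docs` are hypothetical names invented for illustration, plain `java.util` sets stand in for DocSets, and the unbounded map stands in for a real, size-limited filterCache.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Stand-alone model of cached facet counting. This is NOT Solr code.
class FacetCountSketch {

    // Hypothetical stand-in for the filterCache: query string -> matching doc ids.
    static class FilterCache {
        private final Map<String, Set<Integer>> entries = new HashMap<>();
        long lookups = 0;
        long hits = 0;

        // Return the cached doc set for a query, evaluating the query only on a miss.
        Set<Integer> getDocSet(String query, Function<String, Set<Integer>> evaluate) {
            lookups++;
            Set<Integer> cached = entries.get(query);
            if (cached != null) {
                hits++;
                return cached;
            }
            Set<Integer> fresh = evaluate.apply(query);
            entries.put(query, fresh);
            return fresh;
        }

        double hitRatio() {
            return lookups == 0 ? 0.0 : (double) hits / lookups;
        }
    }

    // Count the ids present in both sets without building a third set.
    static int intersectionSize(Set<Integer> a, Set<Integer> b) {
        Set<Integer> small = a.size() <= b.size() ? a : b;
        Set<Integer> large = (small == a) ? b : a;
        int count = 0;
        for (Integer id : small) {
            if (large.contains(id)) {
                count++;
            }
        }
        return count;
    }

    // Convenience helper for building a doc-id set in the examples.
    static Set<Integer> docs(int... ids) {
        Set<Integer> set = new HashSet<>();
        for (int id : ids) {
            set.add(id);
        }
        return set;
    }

    public static void main(String[] args) {
        FilterCache cache = new FilterCache();
        Set<Integer> results = docs(1, 2, 3, 5, 8);
        // Pretend query evaluation: every facet query matches docs 2, 3 and 4.
        Function<String, Set<Integer>> evaluate = q -> docs(2, 3, 4);

        // The same facet is counted for three requests; only the first one misses.
        for (int i = 0; i < 3; i++) {
            Set<Integer> facetDocs = cache.getDocSet("color:red", evaluate);
            System.out.println("color:red count = " + intersectionSize(facetDocs, results));
        }
        System.out.println("lookups=" + cache.lookups + " hits=" + cache.hits
                + " hitRatio=" + cache.hitRatio());
    }
}
```

The point of the sketch is only that the facet item never caches anything itself: repeated requests for the same facet query reuse the cached set, and only the cheap intersection count is recomputed per request.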

: The basic handling and output format work for my prototype's purposes,
: but I have not delved deeply into caching at this time. Does this
: setup seem appropriate, and the abovementioned caching assumption seem
: valid, or have I missed something that would help support facets on a
: larger scale?

i think you are definitely on the right track. The best way to
understand how the built in caching impacts your usage is to take a look
at the statistics screen linked to from the admin page (i think that's
what it's called, i have limited computer access at the moment) ... it
will show you the cache inserts/lookups for each of your configured
caches. for DocSets, take a look at the filterCache ... try various
queries and watch the number of inserts/lookups as the results of the
individual facets are reused over and over. if you set the size of the
filterCache big enough, and you have a lot of facets, you can easily see
your cache hit ratio above 99%

-Hoss