:        DocSet valueDocSet = req.getSearcher().getDocSet(item.getQuery());
:        long count = valueDocSet.intersectionSize(results);
:
: Is this the preferred way to obtain such a count, or is there another

that's a very good way to do it.  You could also use
SolrIndexSearcher.numDocs -- it is essentially the same thing, but in the
future there may be optimizations that can be done to eliminate the
construction of one DocSet (if the other one already exists).
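
to make the counting step concrete, here is a small stand-alone sketch in
plain Java.  It does not use any Solr classes: java.util.BitSet stands in
for the two DocSets, and the class and method names are made up for
illustration.  It only shows what "intersection size" computes, not how
Solr computes it.

```java
import java.util.BitSet;

// Stand-alone illustration only: BitSet stands in for Solr's DocSet.
class FacetCountSketch {

    // Number of doc ids present in both sets; neither input is modified.
    static int intersectionSize(BitSet facetDocs, BitSet baseResults) {
        BitSet both = (BitSet) facetDocs.clone();
        both.and(baseResults);
        return both.cardinality();
    }

    public static void main(String[] args) {
        // Docs matching the main query (made-up ids).
        BitSet results = new BitSet();
        for (int doc : new int[] {1, 3, 5, 8}) results.set(doc);

        // Docs matching one facet item's query (made-up ids).
        BitSet facetDocs = new BitSet();
        for (int doc : new int[] {3, 4, 5, 9}) facetDocs.set(doc);

        // Docs 3 and 5 are in both sets.
        System.out.println(intersectionSize(facetDocs, results)); // prints 2
    }
}
```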

: way, such as dealing directly with BitSets (something I avoided, since
: it appears getBits() is deprecated in the DocSet interface)?

avoid the BitSets -- Solr actually doesn't use them natively at all any
more (take a look at the OpenBitSet class) and even then small DocSets
don't have one -- a HashDocSet is used instead.

: Similarly, since this method is commented as "cache-aware", does that
: mean that the item itself does not need to worry about caching its
: results, only its terms, since the results will end up in the
: queryResultCache?  Or is this assumption incorrect, and should each
: facet/item be concerned with caching its results as well?

You should be able to avoid worrying about caching the results of your
individual facets: the filterCache will take care of that.


: general.  To that end, my current structure defines:
:
: - a <facetHandler/> entry in solrconfig.xml, the only current
: implementation of which loads a set of Facet definitions from an xml
: file.
: - each Facet contains an id for lookups and a List of FacetItems (some
: statically configured, some constructed dynamically from available
: Terms, though not backed by any cache yet.)
: - each FacetItem contains a displayName and Query (and associated queryString)

I'm not sure i understand what exactly the "facetHandler" registration
gains you that you couldn't have achieved in a custom requestHandler
(without needing to modify the internals/config parsing and so on) ...
your custom request handler could take in a "FacetHandler" class name as
an init param, or it could take in the XML information directly as a
deeply nested set of init params. am i missing something else?
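
a rough sketch of the class-name-as-init-param idea, in plain Java.
Everything in it is hypothetical: the FacetHandler interface, the
"facetHandlerClass" param name, and the plain Map standing in for the
request handler's init args.  None of it is Solr API; it only shows the
reflection step.

```java
import java.util.List;
import java.util.Map;

// Hypothetical plug-in interface; not part of Solr.
interface FacetHandler {
    List<String> facetIds();
}

// Hypothetical implementation.  A real one would load Facet definitions
// from an XML file; fixed ids keep the sketch self-contained.
class XmlFacetHandler implements FacetHandler {
    public List<String> facetIds() {
        return List.of("category", "price");
    }
}

class FacetHandlerLoader {

    // Instantiates the class named by the (assumed) "facetHandlerClass" param.
    static FacetHandler load(Map<String, String> initArgs) {
        String className = initArgs.get("facetHandlerClass");
        if (className == null) {
            throw new IllegalArgumentException("missing facetHandlerClass init param");
        }
        try {
            return Class.forName(className)
                    .asSubclass(FacetHandler.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot create " + className, e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> initArgs =
                Map.of("facetHandlerClass", XmlFacetHandler.class.getName());
        System.out.println(load(initArgs).facetIds());
    }
}
```

the request handler would call something like load(...) once at init time
and keep the resulting object around, so no changes to the config parsing
are needed.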

: The basic handling and output format work for my prototype's purposes,
: but I have not delved deeply into caching at this time. Does this
: setup seem appropriate, and the abovementioned caching assumption seem
: valid, or have I missed something that would help support facets on a
: larger scale?

i think you are definitely on the right track.  The best way to understand
how the built-in caching impacts your usage is to take a look at the
statistics screen linked to from the admin page (i think that's what it's
called, i have limited computer access at the moment) ... it will show you
the cache inserts/lookups for each of your configured caches.  for
DocSets, take a look at the filterCache ... try various queries and watch
the number of inserts/lookups as the results of the individual facets are
reused over and over.  if you set the size of the filterCache big enough,
and you have a lot of facets, you can easily see your cache hit ratio go
above 99%.
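
to see why the ratio climbs so high when the same facet queries are reused,
here is a toy LRU cache with lookup/hit/insert counters in plain Java.  It
is NOT how the filterCache is implemented; the class name, the query
strings, and the use of String::length as a stand-in for "compute the
DocSet" are all made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Toy LRU cache with counters; not Solr's filterCache implementation.
class CountingLruCache<K, V> {
    private final Map<K, V> entries;
    long lookups, hits, inserts;

    CountingLruCache(final int maxSize) {
        // accessOrder=true keeps the least recently used entry first,
        // so removeEldestEntry evicts it once the cache is over capacity.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Returns the cached value, computing and inserting it on a miss.
    V get(K key, Function<K, V> compute) {
        lookups++;
        V value = entries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        value = compute.apply(key);
        entries.put(key, value);
        inserts++;
        return value;
    }

    double hitRatio() {
        return lookups == 0 ? 0.0 : (double) hits / lookups;
    }

    public static void main(String[] args) {
        CountingLruCache<String, Integer> cache = new CountingLruCache<>(10);
        String[] facetQueries = {"cat:books", "cat:music", "cat:video"};
        // 100 requests, each evaluating the same three facet queries.
        for (int request = 0; request < 100; request++) {
            for (String q : facetQueries) {
                cache.get(q, String::length);
            }
        }
        // 3 inserts, 300 lookups, 297 hits.
        System.out.println(cache.inserts + " inserts, " + cache.lookups
                + " lookups, hit ratio " + cache.hitRatio());
    }
}
```

if the cache is sized smaller than the number of distinct facet queries,
entries get evicted before they are reused and the ratio drops, which is
the reason for making the filterCache big enough.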


-Hoss
