FWIW, facets are much less heap-greedy when counted on docValues-enabled fields; they should not hit UnInvertedField in that case. Try them.
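On the per-facet question raised in the thread: Solr accepts per-field overrides of the form f.<field>.facet.method, so the method can be picked for each facet individually. Here is a rough sketch of building such a request. The field names (category, author) and the helper facet_params are made up for illustration, and whether docValues is enabled is a schema setting that this request does not show.

```python
from urllib.parse import urlencode


def facet_params(query, methods):
    """Build Solr facet parameters with a per-field facet.method.

    `methods` maps a field name to a facet method such as "enum" or "fc".
    This helper is hypothetical; only the parameter names are Solr's.
    """
    params = [("q", query), ("rows", "0"), ("facet", "true")]
    for field, method in methods.items():
        params.append(("facet.field", field))
        # Per-field override: f.<field>.facet.method
        params.append((f"f.{field}.facet.method", method))
    return params


# Hypothetical fields: one low-cardinality, one high-cardinality.
params = facet_params("*:*", {"category": "enum", "author": "fc"})
query_string = urlencode(params)
print(query_string)
```

The resulting query string would be appended to the select handler URL of whichever core is being queried.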
On Thu, Apr 10, 2014 at 8:20 PM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:

> Shawn Heisey [s...@elyograg.org] wrote:
> > On 4/9/2014 11:53 PM, Toke Eskildsen wrote:
> >> The memory allocation for enum is both low and independent of the amount
> >> of unique values in the facets. The trade-off is that it is very slow
> >> for medium- to high-cardinality fields.
>
> > This is where it is extremely beneficial to have enough RAM to cache
> > your entire index. The term list must be enumerated for every facet
> > request, but if the data is already in the OS disk cache, this is very
> > fast.
>
> Very fast compared to not cached, yes, but still slow compared to fc, for
> high-cardinality. The processing overhead per term is a great deal larger
> for enum. I recently ran some tests with Solr's different faceting methods
> for 50M+ values, but stopped measuring for enum as it took so much longer
> than the other methods. For a fully cached index.
>
> > If facets are happening on lots of fields and are heavily utilized,
> > facet.method=enum should be used, and there must be plenty of RAM to
> > cache all or most of the index data on the machine.
>
> I do not understand how the number of facets has any influence on the
> choice between enum and fc. As Solr (sadly) does not support combined
> structures for multiple facets, each facet is independent from the others.
> Shouldn't the choice be done for each individual facet?
>
> - Toke Eskildsen
>

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>