On 5/1/2014 3:03 PM, Aman Tandon wrote:
> Please check that link
> http://wiki.apache.org/solr/SimpleFacetParameters#facet.method there is
> something mentioned in facet.method wiki
>
> *The default value is fc (except for BoolField which uses enum) since it
> tends to use less memory and is faster than the enumeration method when a
> field has many unique terms in the index.*
>
> So can you explain how enum is faster than the default? Also, we are
> currently using Solr 4.2. Does that support facet.method=enum? If not,
> which version should we pick?
>
> We are planning to move to SolrCloud with Solr 4.7.1, so will this
> 14 GB of RAM be sufficient, or should we increase it?

The fc method (short for fieldcache) loads all the data required to
build facets on that field into the fieldcache, and that data stays
there until the next commit or restart.  If you are committing
frequently, the cache is thrown away and rebuilt each time, so that
memory use might be wasted.

I was surprised to read that fc uses less memory.  It may well be true
that a single call with facet.method=enum needs more memory than the
fieldcache entry for facet.method=fc, but the memory used by enum can
be reclaimed as garbage.  With the fc method, it can't be reclaimed.
It sits there, waiting for that facet to be used again so it can speed
it up, and it is only thrown away when you commit and open a new
searcher.

If you use a lot of different facets, the fieldcache can become HUGE
with the fc method.  *If you don't do all those facets at the same time*
(a very important qualifier), you can switch to enum and the total
amount of resident heap memory required will be a lot less.  There may
be a lot of garbage to collect, but the total heap requirement at any
given moment should be smaller.  If you actually need to build a lot of
different facets at nearly the same time, enum may not actually help.
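
If you want to try this without changing the default for every field,
the method can be set globally and then overridden per field with the
f.<fieldname>.facet.method parameter.  Here is a rough sketch in Python
that only builds the request URL.  The host, core name (collection1)
and field names (category, brand) are made up for illustration, so
substitute your own.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, fields, method="enum", per_field_methods=None):
    """Build a Solr select URL that facets on `fields`.

    `method` sets the global facet.method; `per_field_methods` maps a
    field name to an override sent as f.<field>.facet.method.
    """
    params = [
        ("q", "*:*"),
        ("rows", "0"),          # only the facet counts are of interest here
        ("facet", "true"),
        ("facet.method", method),
    ]
    for field in fields:
        params.append(("facet.field", field))
    for field, field_method in (per_field_methods or {}).items():
        params.append(("f.%s.facet.method" % field, field_method))
    return base_url + "?" + urlencode(params)


# Hypothetical host, core and field names: enum everywhere except brand.
url = facet_query_url(
    "http://localhost:8983/solr/collection1/select",
    ["category", "brand"],
    method="enum",
    per_field_methods={"brand": "fc"},
)
print(url)
```

Keeping fc on the few fields you facet on constantly, and enum on the
rest, is one way to test the tradeoff described above.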

The enum method is actually a little slower than fc for a single run,
but the Java heap characteristics across multiple runs can make enum
faster in bulk.  Try both and see what your results are.
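
For the "try both" part, something like the following Python sketch
could be used to time repeated requests for each method.  It is only a
starting point: the function name and the url_for_method callback are
my own invention, and it measures wall-clock request time only, so
heap usage and garbage collection would still need to be watched
separately on the Solr side.

```python
import time
from urllib.request import urlopen


def time_facet_method(url_for_method, method, runs=5, fetch=None):
    """Return a list of elapsed seconds, one per request.

    `url_for_method` is a caller-supplied function that turns a
    facet.method value ("enum" or "fc") into a full request URL.
    `fetch` defaults to a plain HTTP GET; pass a stub to test offline.
    """
    if fetch is None:
        fetch = lambda url: urlopen(url).read()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fetch(url_for_method(method))
        timings.append(time.perf_counter() - start)
    return timings


# Example use against a real server (hypothetical URL, not run here):
#   make_url = lambda m: ("http://localhost:8983/solr/collection1/select"
#                         "?q=*:*&rows=0&facet=true&facet.field=category"
#                         "&facet.method=" + m)
#   for m in ("enum", "fc"):
#       print(m, time_facet_method(make_url, m, runs=10))
```

Run it after a commit as well as on a warm searcher, since the first fc
request after a commit has to rebuild the fieldcache entry.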

Thanks,
Shawn
