On Wed, 2017-10-04 at 21:42 -0700, S G wrote:
> The bit-vectors in filterCache are as long as the maximum number of
> documents in a core. If there are a billion docs per core, every bit
> vector will have a billion bits, making its size 10^9 / 8 bytes ≈ 128 MB

The tricky part here is that there are sparse (aka few-hit) entries
that take up less space. The full bit-vector at 1 bit per document is
the worst case.
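
As a rough sketch of the arithmetic (assuming the usual split between a
full bit-vector for dense entries and a sorted list of doc-ids, roughly
4 bytes each, for sparse ones):

  dense entry:  10^9 docs * 1 bit / 8      ≈ 125 MB
  sparse entry: 1M matching docs * 4 bytes ≈   4 MB

So the real footprint of the cache depends on how many of the entries
end up dense.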

This is both good and bad. The good part is of course that it saves
memory. The bad part is that it often means people set the filterCache
size to a high number and it works well, right up until a series of
filters with many hits comes along.

It seems that the memory limit option maxSizeMB was added in Solr 5.2:
https://issues.apache.org/jira/browse/SOLR-7372
I am not sure if it works with all caches in Solr, but in my world it
is way better to define the caches by memory instead of by count.
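
As a sketch, a memory-bounded filterCache in solrconfig.xml could look
something like this (the attribute is spelled maxRamMB in the LRUCache
implementation; the value 1024 is only an example, so check SOLR-7372
for the exact syntax in your version):

  <filterCache class="solr.LRUCache"
               maxRamMB="1024"
               autowarmCount="0"/>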

> With such a big cache-value per entry, the default value of 128
> entries will become 128 x 128 MB = 16 GB and would not be very good
> for a system running below 32 GB of memory.

Sure. The default values are just that. For an index with 1M documents
and a lot of different filters, 128 would probably be too low.
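
Same back-of-the-envelope as above: a dense entry for a 1M-doc index is
only 10^6 / 8 ≈ 125 KB, so even a few thousand entries would stay in
the low hundreds of MB.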

If someone were to create a well-researched set of config files for
different scenarios, it would be a welcome addition to our shared
knowledge pool.

> If such a use-case is anticipated, either the JVM's max memory should
> be increased to beyond 40 GB or the filterCache size reduced to 32.

Best solution: Use maxSizeMB (if it works)
Second best solution: Reduce the size to 32 or fewer entries (see the
snippet below)
Third best, but often used, solution: Hope that most of the entries are
sparse and will remain so
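
For the second option, the count-based setting in solrconfig.xml would
look something like this (values just illustrative):

  <filterCache class="solr.FastLRUCache"
               size="32"
               initialSize="32"
               autowarmCount="0"/>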

- Toke Eskildsen, Royal Danish Library
