Hi,

We're running Solr 4.10.1 on Linux using Tomcat. It is a distributed
environment of 40 well-provisioned virtual servers. The workload combines
concurrent, fairly complex queries (possibly hundreds of terms), NRT
indexing, and a few hundred facet fields, some of which may have hundreds
of thousands of distinct values.

We've configured a 6GB JVM heap, and after quite a bit of work, it seems to
be pretty well configured GC parameter-wise (we're using CMS and ParNew).

The following problem occurs:
Once every couple of hours, we suddenly start getting
"concurrent-mode-failure" on one or more servers. Memory usage keeps
climbing and the "concurrent-mode-failure" messages continue.
Naturally, during this time Solr is unresponsive and queries time out.
Eventually it may pass (a GC succeeds) after 5-10 minutes.
Sometimes this goes on for a long time, with one server after another
being affected.

Memory dumps point to ConcurrentLRUCache (used in filterCache and
fieldValueCache). Mathematically speaking, the sizes I see in the dumps do
not make sense. The configured sizes shouldn't take up more than a few
hundred MB.
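
To show the kind of arithmetic I mean, here is a rough sketch of a
worst-case estimate for filterCache. It assumes every cached filter is
stored as a full bitset of one bit per document (maxDoc / 8 bytes), which
is an upper bound since small result sets are normally stored more
compactly. The maxDoc and cache size below are hypothetical placeholders,
not our real values, and fieldValueCache is not covered.

```java
public class FilterCacheEstimate {

    // Bytes needed for a bitset with one bit per document in the core.
    static long bitSetBytes(long maxDoc) {
        return (maxDoc + 7) / 8;
    }

    // Upper bound: every cache entry is a full bitset.
    static long worstCaseBytes(long maxDoc, int cacheSize) {
        return bitSetBytes(maxDoc) * cacheSize;
    }

    public static void main(String[] args) {
        long maxDoc = 10_000_000L; // hypothetical documents per core
        int cacheSize = 512;       // hypothetical filterCache "size" setting

        long bytes = worstCaseBytes(maxDoc, cacheSize);
        System.out.printf("worst-case filterCache: %d MB%n",
                bytes / (1024 * 1024));
    }
}
```

With our real numbers plugged in, an estimate like this stays in the
hundreds of MB, which is why the sizes in the dumps look wrong to me.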

Any ideas? Anyone seen this kind of problem?
