Hey guys,

I'm running a SolrCloud cluster consisting of five nodes. My largest index contains 2.5 million documents and occupies about 6 GB of disk space. We recently upgraded to the latest Solr version (4.10) from version 4.4.1, which we ran successfully for about a year without any major issues.

From the get-go we started having memory problems, caused by the CMS old generation heap usage filling up incrementally. The cluster starts out with very low memory consumption, and after 12 hours or so it ends up using all available heap space. We thought it could be one of the caches we had configured, so we reduced the filter cache max size on our main core from 1024 to 512 entries. The only thing we accomplished was that the cluster ran for a longer time before exhausting the heap.
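For reference, the filter cache change amounts to something like this in solrconfig.xml (the class and autowarmCount values here are illustrative, not necessarily what we run; only the size was changed). As a back-of-the-envelope check, each filterCache entry is roughly a bitset of maxDoc bits, so on a 2.5M-document core that's about 300 KB per entry, or around 300 MB at 1024 entries; significant, but it shouldn't by itself exhaust a 15 GB heap:

    <!-- solrconfig.xml (illustrative sketch): the filterCache whose max
         size we reduced from 1024 to 512; class and autowarmCount are
         assumptions, not necessarily our exact settings -->
    <query>
      <!-- each cached entry is roughly one bit per document in the core,
           so ~300 KB per entry on a 2.5M-doc index, ~150 MB at size=512 -->
      <filterCache class="solr.FastLRUCache"
                   size="512"
                   initialSize="512"
                   autowarmCount="0"/>
    </query>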
I generated several heap dumps, and what is filling up the heap is basically Lucene's field cache; it gets bigger and bigger until it fills up all available memory. My JVM memory settings are the following:

-Xms15g -Xmx15g -XX:PermSize=512m -XX:MaxPermSize=512m -XX:NewSize=5g -XX:MaxNewSize=5g -XX:+UseParNewGC -XX:+ExplicitGCInvokesConcurrent -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC

What's weird to me is that we didn't have this problem before, so I'm thinking this is some kind of memory leak present in the new Lucene version. We ran our old cluster for several weeks at a time without having to redeploy because of config changes or other reasons. Has any issue been reported related to elevated memory consumption by the field cache? Any help would be greatly appreciated.

Regards,

--
Luis Carlos Guerrero
about.me/luis.guerrero
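P.S. In case it helps with diagnosis: as far as I understand, Lucene populates the field cache lazily, per segment, whenever you sort, facet, or run function queries on an indexed field that doesn't have docValues. A hypothetical schema.xml field along those lines (not our exact schema, the name and type are made up for illustration) would look like the first definition below, with the commonly suggested alternative after it:

    <!-- hypothetical sort field: indexed but without docValues, so the
         first sort on it makes Lucene un-invert the field into the
         on-heap FieldCache -->
    <field name="created_at" type="tdate" indexed="true" stored="false"/>

    <!-- commonly suggested alternative: enable docValues so sort values
         are read from the docValues files on disk instead of the
         FieldCache (requires reindexing) -->
    <field name="created_at" type="tdate" indexed="true" stored="false"
           docValues="true"/>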