I apologize in advance for what's probably a foolish question, but I'm trying to get a feel for how much memory a properly-configured Solr instance should be using.
I have an index with 2.5 million documents. The documents aren't all that large, but the index is 25GB, and we optimize it fairly often. We're consistently running out of memory: sometimes it's a heap space error, and other times the machine runs into swap. (The latter may not be directly related to Solr, but nothing else is running on the box.)

We have four dedicated servers for this, each a quad Xeon with 16GB of RAM. One master receives all updates, and three slaves handle queries. The three slaves have Tomcat configured with a 14GB heap.

There really isn't much disk activity, and the machines seem underloaded to me: they receive less than one query per second on average, and requests are served in about 300ms on average, so it's not as if many concurrent queries are backing up. We do use multi-field faceting in some searches, and I'm having a hard time figuring out how big an impact that has. None of our caches (filter, auto-warming, etc.) is set for more than 512 documents.

Obviously, memory usage is going to be very variable, but what I'm wondering is:

a.) Does this sound like a sane configuration, or is something seriously wrong? It seems that many people are able to run considerably larger indexes with considerably fewer resources.

b.) Is there any documentation on how Solr uses memory? Is Solr attempting to cram as much of the 25GB index into memory as possible? Maybe I just overlooked something, but I don't know how to begin calculating Solr's memory requirements.

c.) Does anything in the description of my setup jump out at you as a potential source of memory problems? We've increased the heap space considerably, up to the current 14GB, and we're still running out of heap space periodically.

Thanks in advance for any help!

-- Matt Wagner
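P.P.S. The cache sections of our solrconfig.xml look roughly like this (the 512 sizes are the values I mentioned; the cache classes and autowarm counts are approximated from memory):

```xml
<!-- from solrconfig.xml on the slaves (approximate) -->
<filterCache      class="solr.LRUCache" size="512" initialSize="512" autowarmCount="512"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="512"/>
<documentCache    class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
```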
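P.S. In case the exact settings matter: the slaves' heap is configured along these lines (the file path and variable name are approximated from memory; only the 14GB figure is certain):

```shell
# setenv.sh for the slave Tomcats (path and variable name approximated --
# the 14GB heap size is the only value taken from our actual setup)
CATALINA_OPTS="-Xms14g -Xmx14g"
export CATALINA_OPTS
```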