I am observing some weird behavior with how Solr is using memory.  We are
running both Solr and ZooKeeper on the same node.  We tested memory
settings on a SolrCloud setup of 1 shard with a 146GB index, and a 2-shard
Solr setup with a 44GB index.  Both are running on similarly beefy
machines.

 After running the setup for 3-4 days, I see that a lot of memory is
inactive on all the nodes:

 99052952  total memory
 98606256  used memory
 19143796  active memory
 75063504  inactive memory
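
One way to break that inactive figure down further on Linux (a general
sketch, not output from our setup) is /proc/meminfo, which separates
file-backed pages the kernel can reclaim from anonymous pages that can
only go to swap:

```shell
# Sketch: inspect the active/inactive split on Linux.
# Inactive(file) is page cache the kernel can reclaim under pressure;
# Inactive(anon) is process memory that can only be swapped out.
grep -E '^(MemTotal|Active|Inactive)' /proc/meminfo
```

If most of the inactive memory shows up under Inactive(file), it is page
cache from reading the index, not a leak in the Solr process itself.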

And the inactive memory is never reclaimed by the OS.  When total memory
is exhausted, latency and disk I/O shoot up.  We observed this behavior in
both the 1-shard SolrCloud setup and the 2-shard Solr setup.

For the SolrCloud setup, we are running a cron job with the following
command to clear out the inactive memory, and it is working as expected.
Even though the index size of the Cloud setup is 146GB, used memory stays
below 55GB, our response times are better, and no errors/exceptions are
thrown.  (This command causes issues in the 2-shard setup.)

echo 3 > /proc/sys/vm/drop_caches
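
The cron entry itself isn't shown above; a hypothetical crontab line
(the hourly schedule is an assumption, and sync is added to flush dirty
pages before the cache is dropped) might look like:

```shell
# Hypothetical /etc/crontab entry (schedule is assumed):
# sync flushes dirty pages so drop_caches does not race unwritten data.
0 * * * * root sync && echo 3 > /proc/sys/vm/drop_caches
```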

We have disabled the query, doc, and Solr caches in our setup.  ZooKeeper
is using around 10GB of memory, and we are not running any other process
on this system.

Has anyone faced this issue before?
