On Tue, May 15, 2012 at 3:19 PM, Yiming Sun <yiming....@gmail.com> wrote:
> Hello,
>
> I was reading the Apache Cassandra 1.0 Documentation PDF dated May 10,
> 2012, and had some questions on what the recommended memory size is.
>
> Below is the snippet from the PDF. Bullet 1 suggests to have 16-32GB of
> RAM, yet Bullet 2 suggests to limit Java heap size to no more than 8GB. My
> understanding is that Cassandra is implemented purely in Java, so all
> memory it sees and uses is the JVM Heap.
>

The main way that additional RAM helps is through the OS page cache, which
will store hot portions of SSTables in memory. Additionally, Cassandra can
now do off-heap caching.

> So can someone help me understand the discrepancy between 16-32GB of RAM
> and 8GB of heap? Thanks.
>
> == snippet ==
> Memory
> The more memory a Cassandra node has, the better read performance. More
> RAM allows for larger cache sizes and reduces disk I/O for reads. More RAM
> also allows memory tables (memtables) to hold more recently written data.
> Larger memtables lead to a fewer number of SSTables being flushed to disk
> and fewer files to scan during a read. The ideal amount of RAM depends on
> the anticipated size of your hot data.
>
> • For dedicated hardware, a minimum of 8GB of RAM is needed. DataStax
>   recommends 16GB - 32GB.
>
> • Java heap space should be set to a maximum of 8GB or half of your total
>   RAM, whichever is lower. (A greater heap size has more intense garbage
>   collection periods.)
>
> • For a virtual environment use a minimum of 4GB, such as Amazon EC2 Large
>   instances. For production clusters with a healthy amount of traffic, 8GB
>   is more common.
>

--
Tyler Hobbs
DataStax <http://datastax.com/>
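
To make the arithmetic in the quoted snippet concrete, here is a minimal Python sketch of the heap guideline (the lower of 8GB and half of total RAM) and of how much RAM that leaves outside the heap. The function names are hypothetical and are not part of Cassandra or its scripts. The "outside the heap" figure is only an upper bound on what the OS page cache and off-heap caches could use, since the operating system and other processes also need memory.

```python
def recommended_heap_gb(total_ram_gb: float, cap_gb: float = 8.0) -> float:
    """Heap size per the quoted guideline: the lower of cap_gb and half of RAM.

    Hypothetical helper for illustration; cap_gb defaults to the 8GB ceiling
    mentioned in the quoted documentation snippet.
    """
    if total_ram_gb <= 0:
        raise ValueError("total_ram_gb must be positive")
    return min(cap_gb, total_ram_gb / 2)


def ram_outside_heap_gb(total_ram_gb: float) -> float:
    """RAM not claimed by the JVM heap (an upper bound on page cache space)."""
    return total_ram_gb - recommended_heap_gb(total_ram_gb)


if __name__ == "__main__":
    # RAM sizes mentioned in the quoted snippet: 4GB, 8GB, 16GB, 32GB.
    for ram in (4, 8, 16, 32):
        heap = recommended_heap_gb(ram)
        rest = ram_outside_heap_gb(ram)
        print(f"{ram} GB RAM -> {heap:g} GB heap, {rest:g} GB outside the heap")
```

With 32GB of RAM the heap stays at 8GB, and the remaining 24GB is what the page cache and off-heap caching can draw on, which is why the two bullets do not contradict each other.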