Thank you, Aaron. 8G of memory is about the spec we use now for testing.
I observed a couple of other things when I checked the output.log file,
but I think those should go in another post.
Thank you very much for your advice.
Bill
On 13/04/12 02:49, aaron morton wrote:
It depends on a lot of things: schema size, caches, workload, etc.
If you are just starting out I would recommend using a machine with
8GB or 16GB of total RAM. By default Cassandra will take about 4GB or
8GB (respectively) for the JVM.
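(For reference, the stock cassandra-env.sh works the default heap out
from the machine's RAM roughly like this; a simplified sketch of the
1.0-era script, so treat the exact caps as approximate:)

    # simplified sketch of calculate_heap_sizes() in cassandra-env.sh:
    # MAX_HEAP_SIZE = max(min(1/2 * RAM, 1024MB), min(1/4 * RAM, 8192MB))
    system_memory_in_mb=$(free -m | awk '/^Mem:/ {print $2}')
    half=$(( system_memory_in_mb / 2 ));    [ "$half" -gt 1024 ] && half=1024
    quarter=$(( system_memory_in_mb / 4 )); [ "$quarter" -gt 8192 ] && quarter=8192
    MAX_HEAP_SIZE="$(( half > quarter ? half : quarter ))M"
    # HEAP_NEWSIZE defaults to roughly min(100MB per core, 1/4 of MAX_HEAP_SIZE)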
Once you have a feel for how things work you should be able to
estimate the resources your application will need.
Hope that helps.
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 13/04/2012, at 2:19 AM, Vasileios Vlachos wrote:
Hello Aaron,
Thank you for getting back to me.
I will change to m1.large first to see how long it takes the
Cassandra node to die (if at all). If I am still not happy I will try
more memory. I just want to test it step by step and see what the
differences are. I will also change the cassandra-env.sh file back to
the defaults.
Is there an absolute minimum requirement for Cassandra in terms of
memory? I might be wrong, but from my understanding we shouldn't have
any problems given the amount of data we store per day (currently
approximately 2-2.5G / day).
Thank you in advance,
Bill
On Wed, Apr 11, 2012 at 7:33 PM, aaron morton
<aa...@thelastpickle.com> wrote:
'system_memory_in_mb' (3760) and 'system_cpu_cores' (1)
according to our nodes' specification. We also changed
'MAX_HEAP_SIZE' to 2G and 'HEAP_NEWSIZE' to 200M (we think
the latter is related to garbage collection).
It's best to leave the default settings unless you know what you
are doing here.
In case you find this useful, swap is off and unevictable memory
seems to be very high on all 3 servers (2.3GB; on other Linux
servers we usually see around 0-16KB of unevictable memory).
Cassandra locks the Java memory so it cannot be swapped out.
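If you want to confirm that from the OS side, the mlocked JVM memory
shows up in /proc/meminfo, e.g. something like:

    grep -E 'Unevictable|Mlocked' /proc/meminfo

So a couple of GB of unevictable memory on a node with a 2GB heap is
what you would expect to see, rather than a problem.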
The problem is that the node we hit from our Thrift interface
dies regularly (approximately after we store 2-2.5G of data). The
error message is OutOfMemoryError: Java Heap Space, and according
to the log it did in fact use all of the allocated memory.
The easiest solution would be to use a larger EC2 instance.
People normally use an m1.xlarge with 16GB of RAM (you could also
try an m1.large).
If you are still experimenting I would suggest using the larger
instances so you can make some progress. Once you have a feel for
how things work you can then try to match the instances to your
budget.
Hope that helps.
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 11/04/2012, at 1:54 AM, Vasileios Vlachos wrote:
Hello,
We have been experimenting a bit with Cassandra lately (version 1.0.7)
and we seem to have some memory problems. We use EC2 as our
test environment and we have three nodes with 3.7G of memory and
1 core @ 2.4GHz, all running Ubuntu Server 11.10.
The problem is that the node we hit from our Thrift interface
dies regularly (approximately after we store 2-2.5G of data). The
error message is OutOfMemoryError: Java Heap Space, and according
to the log it did in fact use all of the allocated memory.
The nodes are under relatively constant load and store about
2000-4000 row keys a minute, which are batched through the Thrift
interface in groups of 10-30 row keys at a time (with about 50
columns each). The number of reads is very low, around 1000-2000 a
day, and each read only requests the data of a single row key. There
is currently only one column family in use.
Our initial thought was that something was wrong in the
cassandra-env.sh file, so we specified the variables
'system_memory_in_mb' (3760) and 'system_cpu_cores' (1)
according to our nodes' specification. We also changed
'MAX_HEAP_SIZE' to 2G and 'HEAP_NEWSIZE' to 200M (we think
the latter is related to garbage collection). Unfortunately,
that did not solve the issue and the node we hit via Thrift
keeps on dying regularly.
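For reference, the overrides in our conf/cassandra-env.sh look roughly
like this (reproduced from memory, so not a verbatim copy of the file):

    # conf/cassandra-env.sh -- our overrides, approximately
    system_memory_in_mb="3760"
    system_cpu_cores="1"
    MAX_HEAP_SIZE="2G"
    HEAP_NEWSIZE="200M"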
In case you find this useful, swap is off and unevictable memory
seems to be very high on all 3 servers (2.3GB; on other Linux servers
we usually see around 0-16KB of unevictable memory). We are not quite
sure how the unevictable memory ties into Cassandra; it's just
something we observed while looking into the problem. The CPU is
pretty much idle the entire time. The heap memory is clearly being
reduced once in a while according to nodetool, but it obviously grows
over the limit as time goes by.
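We have been watching the heap usage with nodetool, roughly like this
(assuming the default JMX port):

    nodetool -h localhost info | grep 'Heap Memory'
    # prints something like: Heap Memory (MB) : <used> / <max>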
Any ideas? Thanks in advance.
Bill
--
Kind regards,
Vasileios Vlachos