Configuration for large number of inserts

Adam Briffett Thu, 24 Mar 2011 04:18:03 -0700

Hi,

When doing bulk inserts of data using Pelops (~1000 million rows,
column counts varying from 1 to 100,000 but skewed towards fewer
columns), we ultimately get a server OOME using 0.7.4. I've
tried following other pointers on this issue (reducing the threshold
at which memtables are flushed to disk, increasing heap space), turned
off compaction (although it still seems to be happening), and also
tried reducing the value of index_interval to avoid using up space.


We're using a single box for this test, with the attached .yaml and a
heap size of 2GB. We're also using a single keyspace and column family;
the settings for the column family are below (we're creating it using
Pelops rather than in the .yaml):

cf.column_type = "Standard";
cf.comparator_type = "UTF8Type";
cf.key_cache_size = 200d;
cf.row_cache_size = 16d;
cf.memtable_throughput_in_mb = 128;
cf.memtable_operations_in_millions = 0.3;
cf.min_compaction_threshold = 0;
cf.max_compaction_threshold = 0;
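
To put the two memtable thresholds above in context, here is a rough
sketch of the arithmetic. It only uses the numbers from this post (2GB
heap, 128 MB throughput, 0.3 million operations) and assumes a memtable
is flushed as soon as either threshold is reached. The class and method
names are hypothetical, and it does not model actual heap usage.

```java
public class MemtableThresholds {
    // Values copied from the column family settings and heap size in this post.
    static final int HEAP_MB = 2048;
    static final int MEMTABLE_THROUGHPUT_MB = 128;
    static final double MEMTABLE_OPERATIONS_MILLIONS = 0.3;

    /** Number of operations at which the operations threshold is reached. */
    static long operationsThreshold() {
        return Math.round(MEMTABLE_OPERATIONS_MILLIONS * 1_000_000);
    }

    /** Throughput threshold expressed as a percentage of the configured heap. */
    static double throughputShareOfHeapPercent() {
        return 100.0 * MEMTABLE_THROUGHPUT_MB / HEAP_MB;
    }

    /**
     * Average bytes per operation at which both thresholds would be hit
     * together. Smaller writes reach the operations threshold first,
     * larger writes reach the throughput threshold first.
     */
    static double breakEvenBytesPerOperation() {
        double throughputBytes = MEMTABLE_THROUGHPUT_MB * 1024.0 * 1024.0;
        return throughputBytes / operationsThreshold();
    }

    public static void main(String[] args) {
        System.out.println("operations threshold: " + operationsThreshold());
        System.out.println("throughput share of heap (%): "
                + throughputShareOfHeapPercent());
        System.out.println("break-even bytes per operation: "
                + breakEvenBytesPerOperation());
    }
}
```

With mostly small columns, the break-even figure suggests which of the
two thresholds is likely to trigger flushes in this test.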

One issue is that compaction still appears to be happening: if I
check using nodetool compactionstats, there are minor compactions
piling up (these get into the thousands; it seems they're being
created faster than they can be processed).

Can anyone suggest where we might be going wrong? As I say, at
present we're just looking to do a bulk insert, with no read activity
until the writes have completed.

Thanks in advance,

Adam

cassandra.yaml
Description: Binary data
