Back in 2.0.4 or 2.0.5 I ran into a problem with delete-only workloads. If I 
did lots of deletes and no upserts, Cassandra would report that the memtable 
was 0 bytes because of an accounting error. The memtable would never flush and 
Cassandra would eventually die. Someone was kind enough to create a patch, 
which seemed to have fixed the problem, but last night it reared its ugly head 
again.

I’m now running 2.0.14. I ran a cleanup process on my cluster (10 nodes, RF=3, 
CL=1). The workload was pretty light, because this cleanup process is 
single-threaded and does everything synchronously. It was performing 4 reads 
per second and about 3000 deletes per second. Over the course of many hours, 
heap slowly grew on all nodes. CPU utilization also increased as GC consumed an 
ever-increasing amount of time. Eventually a couple of nodes shed 3.5 GB of 
their 7.5 GB of heap. Other nodes weren’t so fortunate and started flapping due 
to 30-second GC pauses.

The workaround is pretty simple. This cleanup process can simply write a dummy 
record with a TTL periodically so that Cassandra can flush its memtables and 
function properly. However, I think this probably ought to be fixed. 
Delete-only workloads can’t be that rare. I can’t be the only one who needs to 
go through and clean up their tables.
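
For anyone who wants to try the workaround, here is a minimal sketch of the idea 
rather than a tested fix. It assumes the cleanup process talks to Cassandra 
through an object with an execute(statement, parameters) method, as the session 
in the DataStax Python driver does. The class name, the table and column names, 
and the interval of 10000 deletes are all hypothetical. It also assumes the 
dummy row should go into the same table that is being cleaned, since each table 
has its own memtable.

```python
class HeartbeatDeleter:
    """Issue deletes and, every `every` deletes, write a short-lived dummy row.

    `session` is any object exposing execute(statement, parameters), for
    example a cassandra-driver Session. `heartbeat_cql` is a hypothetical
    INSERT ... USING TTL statement supplied by the caller.
    """

    def __init__(self, session, heartbeat_cql, every=10000):
        if every < 1:
            raise ValueError("every must be at least 1")
        self.session = session
        self.heartbeat_cql = heartbeat_cql
        self.every = every
        self.deletes = 0

    def delete(self, statement, parameters=None):
        # Run the delete synchronously, like the single-threaded cleanup job.
        self.session.execute(statement, parameters)
        self.deletes += 1
        # Periodically add a real write so the memtable has a nonzero size.
        if self.deletes % self.every == 0:
            self.session.execute(self.heartbeat_cql, None)


# Hypothetical statements; adjust keyspace, table, and columns to your schema.
HEARTBEAT_CQL = (
    "INSERT INTO my_keyspace.my_table (id, payload) "
    "VALUES ('flush-heartbeat', 'x') USING TTL 60"
)
DELETE_CQL = "DELETE FROM my_keyspace.my_table WHERE id = %s"
```

The cleanup loop would then call deleter.delete(DELETE_CQL, (row_id,)) instead 
of calling the session directly. The TTL keeps the dummy row from accumulating, 
and the interval is a guess that would need tuning against the delete rate.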

Robert
