> > But in Cassandra output log :
> >
> > r...@cassandra-2:~# tail -f /var/log/cassandra/output.log
> > INFO 15:32:05,390 GC for ConcurrentMarkSweep: 1359 ms, 4295787600 reclaimed leaving 1684169392 used; max is 6563430400
> > INFO 15:32:09,875 GC for ConcurrentMarkSweep: 1363 ms, 4296991416 reclaimed leaving 1684201560 used; max is 6563430400
> > INFO 15:32:14,370 GC for ConcurrentMarkSweep: 1341 ms, 4295467880 reclaimed leaving 1684879440 used; max is 6563430400
> > INFO 15:32:18,906 GC for ConcurrentMarkSweep: 1343 ms, 4296386408 reclaimed leaving 1685489208 used; max is 6563430400
> > INFO 15:32:23,564 GC for ConcurrentMarkSweep: 1511 ms, 4296407088 reclaimed leaving 1685488744 used; max is 6563430400
> > INFO 15:32:28,068 GC for ConcurrentMarkSweep: 1347 ms, 4295383216 reclaimed leaving 1686469448 used; max is 6563430400
> > INFO 15:32:32,617 GC for ConcurrentMarkSweep: 1376 ms, 4295689192 reclaimed leaving 1687908304 used; max is 6563430400
> > INFO 15:32:37,283 GC for ConcurrentMarkSweep: 1468 ms, 4296056176 reclaimed leaving 1687916880 used; max is 6563430400
> > INFO 15:32:41,811 GC for ConcurrentMarkSweep: 1358 ms, 4296412232 reclaimed leaving 1688437064 used; max is 6563430400
> > INFO 15:32:46,436 GC for ConcurrentMarkSweep: 1368 ms, 4296105472 reclaimed leaving 1691050032 used; max is 6563430400
> > INFO 15:32:51,180 GC for ConcurrentMarkSweep: 1545 ms, 4297439832 reclaimed leaving 1691033816 used; max is 6563430400
> > INFO 15:32:55,703 GC for ConcurrentMarkSweep: 1379 ms, 4295491928 reclaimed leaving 1692891456 used; max is 6563430400
> > INFO 15:33:00,328 GC for ConcurrentMarkSweep: 1378 ms, 4296657208 reclaimed leaving 1694981528 used; max is 6563430400
>
> Note that those are ConcurrentMarkSweep GCs rather than ParNews, so
> they should be running concurrently with the application and should not
> correlate to 1.3 second pauses for the application.
When I have this behaviour (ConcurrentMarkSweep, high CPU...), Cassandra is running but there have been no writes and no reads for hours (I stopped reads & writes when the behaviour started). Even after wiping the data on all nodes, the behaviour started again after some hours of writing... :-(

> As for the discrepancy between nodes, are all nodes handling a
> similar amount of traffic? I briefly checked your original post and
> you said you're doing TimeUUID insertions. I don't remember off hand,
> and a quick google didn't tell me, whether there is something special
> about the TimeUUID type that would prevent it - but normally if
> you're using an OrderedPartitioner you may simply be writing all your
> data to a single node, for token space division reasons and the fact
> that timestamps are highly ordered.

In theory, yes. But in fact, this behaviour happens first on the heavier nodes (those which hold the largest quantity of data).

> How big a latency are we talking about in the cases where you're
> timing out (i.e., what's the timeout)? Were the timeouts on reads,
> writes or both?

They are TimeoutExceptions on writes (using C++ code -> Thrift -> Cassandra). This cluster is used at 99% to handle writes. How could I get/measure latency?

Olivier